NUCLEIC ACID-ASSOCIATED PROTEINS
TECHNICAL FIELD 

The invention relates to novel nucleic acids, nucleic acid-associated proteins encoded by these
nucleic acids, and to the use of these nucleic acids and proteins in the diagnosis, treatment, and
prevention of cell proliferative, neurological, reproductive, developmental, autoimmune/inflammatory,
and DNA repair disorders, and infections. The invention also relates to the assessment of the effects
of exogenous compounds on the expression of nucleic acids and nucleic acid-associated proteins.

BACKGROUND OF THE INVENTION

Multicellular organisms are composed of diverse cell types that differ dramatically both in
structure and function. The identity of a cell is determined by its characteristic pattern of gene
expression, and different cell types express overlapping but distinctive sets of genes throughout
development. Spatial and temporal regulation of gene expression is critical for the control of cell
proliferation, cell differentiation, apoptosis, and other processes that contribute to organismal
development. Furthermore, gene expression is regulated in response to extracellular signals that
mediate cell-cell communication and coordinate the activities of different cell types. Appropriate gene
regulation also ensures that cells function efficiently by expressing only those genes whose functions
are required at a given time.

The cell nucleus contains all of the genetic information of the cell in the form of DNA, and the
components and machinery necessary for replication of DNA and for transcription of DNA into
RNA. (See Alberts, B. et al. (1994) Molecular Biology of the Cell, Garland Publishing Inc., New York
NY, pp. 335-399.) DNA is organized into compact structures in the nucleus by interactions with
various DNA-binding proteins such as histones and non-histone chromosomal proteins.
DNA-specific nucleases, DNAses, partially degrade these compacted structures prior to DNA
replication or transcription. DNA replication takes place with the aid of DNA helicases which unwind
the double-stranded DNA helix, and DNA polymerases that duplicate the separated DNA strands.
Transcription Factors 

Transcriptional regulatory proteins are essential for the control of gene expression. Some of
these proteins function as transcription factors that initiate, activate, repress, or terminate gene
transcription. Transcription factors generally bind to the promoter, enhancer, and upstream regulatory
regions of a gene in a sequence-specific manner, although some factors bind regulatory elements
within or downstream of a gene coding region. Transcription factors may bind to a specific region of
DNA singly or as a complex with other accessory factors (reviewed in Lewin, B. (1990) Genes IV,
Oxford University Press, New York NY, and Cell Press, Cambridge MA, pp. 554-570).

The double helix structure and repeated sequences of DNA create topological and chemical
features which can be recognized by transcription factors. These features are hydrogen bond donor
and acceptor groups, hydrophobic patches, major and minor grooves, and regular, repeated stretches
of sequence which induce distinct bends in the helix. Typically, transcription factors recognize specific
DNA sequence motifs of about 20 nucleotides in length. Multiple, adjacent transcription factor-binding
motifs may be required for gene regulation.

Many transcription factors incorporate DNA-binding structural motifs which comprise either
α helices or β sheets that bind to the major groove of DNA. Four well-characterized structural motifs
are helix-turn-helix, zinc finger, leucine zipper, and helix-loop-helix. Proteins containing these motifs
may act alone as monomers, or they may form homo- or heterodimers that interact with DNA.

The helix-turn-helix motif consists of two α helices connected at a fixed angle by a short
chain of amino acids. One of the helices binds to the major groove. Helix-turn-helix motifs are
exemplified by the homeobox motif which is present in homeodomain proteins. These proteins are
critical for specifying the anterior-posterior body axis during development and are conserved
throughout the animal kingdom. The Antennapedia and Ultrabithorax proteins of Drosophila
melanogaster are prototypical homeodomain proteins (Pabo, C.O. and R.T. Sauer (1992) Annu. Rev.
Biochem. 61:1053-1095).

Homeobox genes are a family of highly conserved regulatory genes that encode transcription
factors. They are essential during embryonic development and are important in limb formation and
reproductive tract development. They function in uterine receptivity and implantation in mice and
probably serve a similar role in humans (Daftary, G.S. and Taylor, H.S. (2000) Semin. Reprod. Med.
18:311-320). Homeobox gene mutations play a role in susceptibility to autism (Ingram, J.L. et al.
(2000) Teratology 62:393-405) and are implicated in human diseases ranging from diabetes to cancer
(Cillo, C. et al. (2001) J. Cell Physiol. 188:161-169).

The helix-loop-helix motif (HLH) consists of a short α helix connected by a loop to a longer α
helix. The loop is flexible and allows the two helices to fold back against each other and to bind to
DNA. The protooncogene Myc, a transcription factor that activates genes required for cellular
proliferation, contains a prototypical HLH motif.

The zinc finger motif, which binds zinc ions, generally contains tandem repeats of about 30
amino acids consisting of periodically spaced cysteine and histidine residues. Examples of this
sequence pattern, designated C2H2 and C3HC4 ("RING" finger), have been described (Lewin,
supra). Zinc finger proteins each contain an α helix and an antiparallel β sheet whose proximity and
conformation are maintained by the zinc ion. Contact with DNA is made by the arginine preceding
the α helix and by the second, third, and sixth residues of the α helix. Variants of the zinc finger motif
include poorly defined cysteine-rich motifs which bind zinc or other metal ions. These motifs may not
contain histidine residues and are generally nonrepetitive. The zinc finger motif may be repeated in a
tandem array within a protein, such that the α helix of each zinc finger in the protein makes contact
with the major groove of the DNA double helix. This repeated contact between the protein and the
DNA produces a strong and specific DNA-protein interaction. The strength and specificity of the
interaction can be regulated by the number of zinc finger motifs within the protein. Though originally
identified in DNA-binding proteins as regions that interact directly with DNA, zinc fingers occur in a
variety of proteins that do not bind DNA (Lodish, H. et al. (1995) Molecular Cell Biology, Scientific
American Books, New York NY, pp. 447-451). For example, Galcheva-Gargova et al. (1996;
Science 272:1797-1802) have identified zinc finger proteins that interact with various cytokine
receptors.

The C2H2-type zinc finger signature motif contains a 28 amino acid sequence, including 2 
conserved Cys and 2 conserved His residues in a C-2-C-12-H-3-H type motif. The motif generally 
occurs in multiple tandem repeats. A cysteine-rich domain including the motif Asp-His-His-Cys 
(DHHC-CRD) has been identified as a distinct subgroup of zinc finger proteins. The DHHC-CRD 
region has been implicated in growth and development. One DHHC-CRD mutant shows defective
function of Ras, a small membrane-associated GTP-binding protein that regulates cell growth and 
differentiation, while other DHHC-CRD proteins probably function in pathways not involving Ras 
(Bartels, D.J. et al. (1999) Mol. Cell Biol. 19:6775-6787). 
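
The C-2-C-12-H-3-H spacing described above can be read as a simple pattern over a protein sequence. The following is a minimal sketch, assuming the spacing is taken literally (Cys, 2 residues, Cys, 12 residues, His, 3 residues, His); curated motif databases allow more variable spacing, so this is an illustration only. The function name and the toy sequence are hypothetical.

```python
import re

# Simplified spacing taken from the text: C-x(2)-C-x(12)-H-x(3)-H.
# This is an assumption for illustration, not a curated signature.
C2H2_PATTERN = re.compile(r"C.{2}C.{12}H.{3}H")


def find_c2h2_candidates(protein):
    """Return (start, matched_text) for each non-overlapping candidate motif."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(protein)]


# Hypothetical toy sequence: two tandem candidate motifs joined by a short linker.
finger = "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H"
toy_protein = "MK" + finger + "TGEKPYA" + finger

hits = find_c2h2_candidates(toy_protein)
for start, text in hits:
    print(start, text)
```

Counting the hits returned for a sequence gives a rough estimate of the number of tandem repeats, which the text identifies as typical of this motif.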

Zinc-finger transcription factors are often accompanied by modular sequence motifs such as
the Kruppel-associated box (KRAB) and the SCAN domain. For example, the
hypoalphalipoproteinemia susceptibility gene ZNF202 encodes a SCAN box and a KRAB domain
followed by eight C2H2 zinc-finger motifs (Honer, C. et al. (2001) Biochim. Biophys. Acta
1517:441-448). The SCAN domain is a highly conserved, leucine-rich motif of approximately 60
amino acids found at the amino-terminal end of zinc finger transcription factors. SCAN domains are
most often linked to C2H2 zinc finger motifs through their carboxyl-terminal end. Biochemical binding
studies have established the SCAN domain as a selective hetero- and homotypic oligomerization
domain. SCAN domain-mediated protein complexes may function to modulate the biological function
of transcription factors (Schumacher, C. et al. (2000) J. Biol. Chem. 275:17173-17179).

The KRAB (Kruppel-associated box) domain is a conserved amino acid sequence spanning
approximately 75 amino acids and is found in almost one-third of the 300 to 700 genes encoding C2H2
zinc fingers. The KRAB domain is found N-terminally with respect to the finger repeats. The KRAB
domain is generally encoded by two exons; the KRAB-A region or box is encoded by one exon and
the KRAB-B region or box is encoded by a second exon. The function of the KRAB domain is the
repression of transcription. Transcription repression is accomplished by recruitment of either the
KRAB-associated protein-1, a transcriptional corepressor, or the KRAB-A interacting protein.
Proteins containing the KRAB domain are likely to play a regulatory role during development
(Williams, A.J. et al. (1999) Mol. Cell Biol. 19:8526-8535). A subgroup of highly related human
KRAB zinc finger proteins detectable in all human tissues is highly expressed in human T lymphoid
cells (Bellefroid, E.J. et al. (1993) EMBO J. 12:1363-1374). The ZNF85 KRAB zinc finger gene, a
member of the human ZNF91 family, is highly expressed in normal adult testis, in seminomas, and in
the NT2/D1 teratocarcinoma cell line (Poncelet, D.A. et al. (1998) DNA Cell Biol. 17:931-943).

The C4 motif is found in hormone-regulated proteins. The C4 motif generally includes only 2
repeats. A number of eukaryotic and viral proteins contain a conserved cysteine-rich domain of 40
to 60 residues (called C3HC4 zinc-finger or RING finger) that binds two atoms of zinc, and is
probably involved in mediating protein-protein interactions. The 3D "cross-brace" structure of the zinc
ligation system is unique to the RING domain. The spacing of the cysteines in such a domain is
C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C. The PHD finger is a
C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in chromatin-mediated
transcriptional regulation.
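
The RING finger spacing given above, C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C, translates directly into a regular expression. The sketch below assumes that "x" stands for any residue and that the spacing alone is tested; it says nothing about zinc binding or folding. The function name and toy sequences are hypothetical.

```python
import re

# Direct transcription of the spacing quoted in the text; "." stands for any residue.
RING_PATTERN = re.compile(
    r"C.{2}C.{9,39}C.{1,3}H.{2,3}C.{2}C.{4,48}C.{2}C"
)


def has_ring_spacing(protein):
    """Return True if the sequence contains the C3HC4 cysteine/histidine spacing."""
    return RING_PATTERN.search(protein) is not None


# Hypothetical toy sequence built with the minimum allowed spacer lengths.
toy_ring = (
    "C" + "AA" + "C" + "A" * 9 + "C" + "A" + "H" + "AA"
    + "C" + "AA" + "C" + "A" * 4 + "C" + "AA" + "C"
)
# The same sequence with the single histidine replaced, which breaks the pattern.
toy_broken = toy_ring.replace("H", "A")

print(has_ring_spacing(toy_ring), has_ring_spacing(toy_broken))
```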

GATA-type transcription factors contain one or two zinc finger domains which bind
specifically to a region of DNA that contains the consecutive nucleotide sequence GATA. NMR
studies indicate that the zinc finger comprises two irregular anti-parallel β sheets and an α helix,
followed by a long loop to the C-terminal end of the finger (Omichinski, J.G. (1993) Science
261:438-446). The helix and the loop connecting the two β-sheets contact the major groove of the
DNA, while the C-terminal part, which determines the specificity of binding, wraps around into the
minor groove.
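
As an illustration of what a consecutive GATA sequence means on double-stranded DNA, the following minimal sketch scans one strand for GATA and for its reverse complement TATC, which marks a GATA site on the opposite strand. It assumes exact four-base matching only; flanking bases that influence binding in practice are ignored. The function name and example sequence are hypothetical.

```python
def find_gata_sites(dna):
    """Return (index, strand) pairs for each GATA site in a DNA string.

    "+" marks GATA on the given strand; "-" marks TATC, the reverse
    complement of GATA, which is a GATA site on the opposite strand.
    """
    dna = dna.upper()
    sites = []
    for i in range(len(dna) - 3):
        window = dna[i:i + 4]
        if window == "GATA":
            sites.append((i, "+"))
        elif window == "TATC":
            sites.append((i, "-"))
    return sites


# Hypothetical example sequence with one site on each strand.
example_sites = find_gata_sites("AAGATAACTATCG")
print(example_sites)
```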

The LIM motif consists of about 60 amino acid residues and contains seven conserved
cysteine residues and a histidine within a consensus sequence (Schmeichel, K.L. and M.C. Beckerle
(1994) Cell 79:211-219). The LIM family includes transcription factors and cytoskeletal proteins
which may be involved in development, differentiation, and cell growth. One example is actin-binding
LIM protein, which may play roles in regulation of the cytoskeleton and cellular morphogenesis (Roof,
D.J. et al. (1997) J. Cell Biol. 138:575-588). The N-terminal domain of actin-binding LIM protein has
four double zinc finger motifs with the LIM consensus sequence. The C-terminal domain of
actin-binding LIM protein shows sequence similarity to known actin-binding proteins such as dematin and
villin. Actin-binding LIM protein binds to F-actin through its dematin-like C-terminal domain. The
LIM domain may mediate protein-protein interactions with other LIM-binding proteins.

Myeloid cell development is controlled by tissue-specific transcription factors. Myeloid zinc
finger proteins (MZF) include MZF-1 and MZF-2. MZF-1 functions in regulation of the development
of neutrophilic granulocytes. A murine homolog MZF-2 is expressed in myeloid cells, particularly in
the cells committed to the neutrophilic lineage. MZF-2 is down-regulated by G-CSF and appears to
have a unique function in neutrophil development (Murai, K. et al. (1997) Genes Cells 2:581-591).

The leucine zipper motif comprises a stretch of amino acids rich in leucine which can form an
amphipathic α helix. This structure provides the basis for dimerization of two leucine zipper proteins.
The region adjacent to the leucine zipper is usually basic, and upon protein dimerization, is optimally
positioned for binding to the major groove. Proteins containing such motifs are generally referred to as
bZIP transcription factors. The leucine zipper motif is found in the proto-oncogenes Fos and Jun,
which comprise the heterodimeric transcription factor AP1 involved in cell growth and the
determination of cell lineage (Papavassiliou, A.G. (1995) N. Engl. J. Med. 332:45-47).

The helix-loop-helix motif (HLH) consists of a short α helix connected by a loop to a longer α
helix. The loop is flexible and allows the two helices to fold back against each other and to bind to
DNA. The transcription factor Myc contains a prototypical HLH motif.

The NF-kappa-B/Rel signature defines a family of eukaryotic transcription factors involved in
oncogenesis, embryonic development, differentiation and immune response. Most transcription factors
containing the Rel homology domain (RHD) bind as dimers to a consensus DNA sequence motif
termed kappa-B. Members of the Rel family share a highly conserved 300 amino acid domain termed
the Rel homology domain. The characteristic Rel C-terminal domain is involved in gene activation and
cytoplasmic anchoring functions. Proteins known to contain the RHD domain include vertebrate
nuclear factor NF-kappa-B, which is a heterodimer of a DNA-binding subunit and the transcription
factor p65, mammalian transcription factor RelB, and vertebrate proto-oncogene c-rel, a protein
associated with differentiation and lymphopoiesis (Kabrun, N. and P.J. Enrietto (1994) Semin. Cancer
Biol. 5:103-112).

A DNA binding motif termed ARID (AT-rich interactive domain) distinguishes an
evolutionarily conserved family of proteins. The approximately 100-residue ARID sequence is present
in a series of proteins strongly implicated in the regulation of cell growth, development, and
tissue-specific gene expression. ARID proteins include Bright (a regulator of B-cell-specific gene
expression), dead ringer (involved in development), and MRF-2 (which represses expression from the
cytomegalovirus enhancer) (Dallas, P.B. et al. (2000) Mol. Cell. Biol. 20:3137-3146).

The ELM2 (Egl-27 and MTA1 homology 2) domain is found in metastasis-associated protein
MTA1 and protein ER1. The Caenorhabditis elegans gene egl-27 is required for embryonic
patterning. MTA1, a human gene with elevated expression in metastatic carcinomas, is a component
of a protein complex with histone deacetylase and nucleosome remodelling activities (Solari, F. et al.
(1999) Development 126:2483-2494). The ELM2 domain is usually found to the N terminus of a
myb-like DNA binding domain. ELM2 is also found associated with an ARID DNA binding domain.

The Iroquois (Irx) family of genes is found in nematodes, insects and vertebrates. Irx genes
usually occur in one or two genomic clusters of three genes each and encode transcriptional
controllers that possess a characteristic homeodomain. The Irx genes function early in development to
specify the identity of diverse territories of the body. Later in development in both Drosophila and
vertebrates, the Irx genes function again to subdivide those territories into smaller domains (reviewed
in Cavodeassi, F. et al. (2001) Development 128:2847-2855). For example, mouse and human Irx4
proteins are 83% conserved and their 63-aa homeodomain is more than 93% identical to that of the
Drosophila Iroquois patterning genes. Irx4 transcripts are predominantly expressed in the cardiac
ventricles. The homeobox gene Irx4 mediates ventricular differentiation during cardiac development
(Bruneau, B.G. et al. (2000) Dev. Biol. 217:266-77).

Histidine triad (HIT) proteins share residues in distinctive dimeric, 10-stranded half-barrel
structures that form two identical purine nucleotide-binding sites. Hint (histidine triad
nucleotide-binding protein)-related proteins, found in all forms of life, and fragile histidine triad
(Fhit)-related proteins, found in animals and fungi, represent the two main branches of the HIT
superfamily. Fhit homologs bind and cleave diadenosine polyphosphates. Fhit-Ap(n)A complexes
appear to function in a proapoptotic tumor suppression pathway in epithelial tissues (Brenner, C. et al.
(1999) J. Cell Physiol. 181:179-187).

Most transcription factors contain characteristic DNA binding motifs, and variations on the
above motifs and new motifs have been and are currently being characterized (Faisst, S. and S.
Meyer (1992) Nucleic Acids Res. 20:3-26). These include the forkhead motif, found in transcription
factors involved in development and oncogenesis (Hacker et al. (1995) EMBO J. 14:5306-5317), and
the T-box protein T-domain, which forms a novel major and minor groove DNA contact. T-box genes
such as Brachyury (T) are essential for tissue specification in development (Muller (1997) Nature
389:884-888).

Chromatin Associated Proteins 
In the nucleus, DNA is packaged into chromatin, the compact organization of which limits the
accessibility of DNA to transcription factors and plays a key role in gene regulation (Lewin, supra,
pp. 409-410). The compact structure of chromatin is determined and influenced by
chromatin-associated proteins such as the histones, the high mobility group (HMG) proteins, and the
chromodomain proteins. There are five classes of histones, H1, H2A, H2B, H3, and H4, all of which
are highly basic, low molecular weight proteins. The fundamental unit of chromatin, the nucleosome,
consists of 200 base pairs of DNA associated with two copies each of H2A, H2B, H3, and H4. H1
links adjacent nucleosomes. HMG proteins are low molecular weight, non-histone proteins that may
play a role in unwinding DNA and stabilizing single-stranded DNA. Chromodomain proteins play a
key role in the formation of highly compacted heterochromatin, which is transcriptionally silent.
Diseases and Disorders Related to Gene Regulation

Mutations in transcription factors contribute to oncogenesis. This is likely due to the role of
transcription factors in the expression of genes involved in cell proliferation. For example, mutations in
transcription factors encoded by proto-oncogenes, such as Fos, Jun, Myc, Rel, and Spi1, may be
oncogenic due to increased stimulation of cell proliferation. Conversely, mutations in transcription
factors encoded by tumor suppressor genes, such as p53, RB1, and WT1, may be oncogenic due to
decreased inhibition of cell proliferation. (Latchman, D. (1995) Gene Regulation: A Eukaryotic
Perspective, Chapman and Hall, London, UK, pp. 242-255.)

Many neoplastic disorders in humans can be attributed to inappropriate gene expression.
Malignant cell growth may result from either excessive expression of tumor promoting genes or
insufficient expression of tumor suppressor genes (Cleary, M.L. (1992) Cancer Surv. 15:89-104). The
zinc finger-type transcriptional regulator WT1 is a tumor-suppressor protein that is inactivated in
children with Wilms' tumor. Deletions of the WT1 gene, or point mutations which destroy the
DNA-binding activity of the protein, are associated with development of the pediatric nephroblastoma,
Wilms' tumor, and Denys-Drash syndrome. (Rauscher, F.J. (1993) FASEB J. 7:896-903.) The
oncogene bcl-6, which plays an important role in large-cell lymphoma, is also a zinc-finger protein
(Papavassiliou, A.G. (1995) N. Engl. J. Med. 332:45-47). Chromosomal translocations may also
produce chimeric loci that fuse the coding sequence of one gene with the regulatory regions of a
second unrelated gene. Such an arrangement likely results in inappropriate gene transcription,
potentially contributing to malignancy. In Burkitt's lymphoma, for example, the transcription factor
Myc is translocated to the immunoglobulin heavy chain locus, greatly enhancing Myc expression and
resulting in rapid cell growth leading to leukemia (Latchman, D.S. (1996) N. Engl. J. Med. 334:28-33).

Certain proteins enriched in glutamine are associated with various neurological disorders
including spinocerebellar ataxia, bipolar affective disorder, schizophrenia, and autism. (Margolis, R.L.
et al. (1997) Human Genetics 100:114-122.) These proteins contain regions with as many as 15 or
more consecutive glutamine residues and may function as transcription factors with a potential role in
regulation of neurodevelopment or neuroplasticity.

In addition, the immune system responds to infection or trauma by activating a cascade of
events that coordinate the progressive selection, amplification, and mobilization of cellular defense
mechanisms. A complex and balanced program of gene activation and repression is involved in this
process. However, hyperactivity of the immune system as a result of improper or insufficient
regulation of gene expression may result in considerable tissue or organ damage. This damage is
well-documented in immunological responses associated with arthritis, allergens, heart attack, stroke, and
infections (Isselbacher, K.J. et al. Harrison's Principles of Internal Medicine, 13/e, McGraw Hill, Inc.
and Teton Data Systems Software, 1996). In particular, a zinc finger protein termed Staf50 (for
Stimulated trans-acting factor of 50 kDa) is a transcriptional regulator and is induced in various cell
lines by interferon-I and -II. Staf50 appears to mediate the antiviral activity of interferon by
down-regulating the viral transcription directed by the long terminal repeat promoter region of human
immunodeficiency virus type-1 in transfected cells. (Tissot, C. (1995) J. Biol. Chem.
270:14891-14898.) Also, the causative gene for autoimmune polyendocrinopathy-candidiasis-ectodermal
dystrophy (APECED) was recently isolated and found to encode a protein with two PHD-type
zinc finger motifs (Bjorses, P. et al. (1998) Hum. Mol. Genet. 7:1547-1553).

Furthermore, the generation of multicellular organisms is based upon the induction and
coordination of cell differentiation at the appropriate stages of development. Central to this process is
differential gene expression, which confers the distinct identities of cells and tissues throughout the
body. Failure to regulate gene expression during development could result in developmental disorders.
Human developmental disorders caused by mutations in zinc finger-type transcriptional regulators
include: urogenital developmental abnormalities associated with WT1; Greig cephalopolysyndactyly,
Pallister-Hall syndrome, and postaxial polydactyly type A (GLI3); and Townes-Brocks syndrome,
characterized by anal, renal, limb, and ear abnormalities (SALL1) (Engelkamp, D. and V. van
Heyningen (1996) Curr. Opin. Genet. Dev. 6:334-342; Kohlhase, J. et al. (1999) Am. J. Hum. Genet.
64:435-445).

Human acute leukemias involve reciprocal chromosome translocations that fuse the ALL-1
gene located at chromosome region 11q23 to a series of partner genes positioned on a variety of
human chromosomes. The fused genes encode chimeric proteins. The AF17 gene encodes a protein
of 1093 amino acids, containing a leucine-zipper dimerization motif located 3' of the fusion point and a
cysteine-rich domain at the N terminus that shows homology to a domain within the protein Br140
(peregrin) (Prasad, R. et al. (1994) Proc. Natl. Acad. Sci. USA 91:8107-8111).

Impaired transcriptional regulation may lead to Alzheimer's disease, a progressive
neurodegenerative disorder that is characterized by the formation of senile plaques and neurofibrillary
tangles containing amyloid beta peptide. These plaques are found in limbic and association cortices of
the brain, including hippocampus, temporal cortices, cingulate cortex, amygdala, nucleus basalis and
locus caeruleus. Early in Alzheimer's pathology, physiological changes are visible in the cingulate
cortex (Minoshima, S. et al. (1997) Ann. Neurol. 42:85-94). In subjects with advanced Alzheimer's
disease, accumulating plaques damage the neuronal architecture in limbic areas and eventually cripple
the memory process.

SYNTHESIS OF NUCLEIC ACIDS 
Polymerases 

DNA and RNA replication are critical processes for cell replication and function. DNA and
RNA replication are mediated by the enzymes DNA and RNA polymerase, respectively, by a
"templating" process in which the nucleotide sequence of a DNA or RNA strand is copied by
complementary base-pairing into a complementary nucleic acid sequence of either DNA or RNA.
However, there are fundamental differences between the two processes.

DNA polymerase catalyzes the stepwise addition of a deoxyribonucleotide to the 3'-OH end
of a polynucleotide strand (the primer strand) that is paired to a second (template) strand. The new
DNA strand therefore grows in the 5' to 3' direction (Alberts, B. et al. (1994) Molecular Biology
of the Cell, Garland Publishing Inc., New York NY, pp. 251-254). The substrates for the
polymerization reaction are the corresponding deoxynucleotide triphosphates which must base-pair
with the correct nucleotide on the template strand in order to be recognized by the polymerase.
Because DNA exists as a double-stranded helix, each of the two strands may serve as a template for
the formation of a new complementary strand. Each of the two daughter cells of a dividing cell
therefore inherits a new DNA double helix containing one old and one new strand. Thus, DNA is said
to be replicated "semiconservatively" by DNA polymerase. In addition to the synthesis of new DNA,
DNA polymerase is also involved in the repair of damaged DNA as discussed below under "Ligases."

In contrast to DNA polymerase, RNA polymerase uses a DNA template strand to
"transcribe" DNA into RNA using ribonucleotide triphosphates as substrates. Like DNA
polymerization, RNA polymerization proceeds in a 5' to 3' direction by addition of a ribonucleoside
monophosphate to the 3'-OH end of a growing RNA chain. DNA transcription generates messenger
RNAs (mRNA) that carry information for protein synthesis, as well as the transfer, ribosomal, and
other RNAs that have structural or catalytic functions. In eukaryotes, three discrete RNA
polymerases synthesize the three different types of RNA (Alberts, supra, pp. 367-368). RNA
polymerase I makes the large ribosomal RNAs, RNA polymerase II makes the mRNAs that will be
translated into proteins, and RNA polymerase III makes a variety of small, stable RNAs, including 5S
ribosomal RNA and the transfer RNAs (tRNA). In all cases, RNA synthesis is initiated by binding of
the RNA polymerase to a promoter region on the DNA and synthesis begins at a start site within the
promoter. Synthesis is completed at a stop (termination) signal in the DNA whereupon both the
polymerase and the completed RNA chain are released.
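
The templating process described in this section can be illustrated with a minimal sketch. It assumes the template strand is supplied as a string read in the 3' to 5' direction, so that each appended nucleotide extends the 3' end of the product and the product is written 5' to 3'. Promoters, start sites, and termination signals are not modeled. The function and variable names are hypothetical.

```python
# Watson-Crick pairing rules; in RNA, uracil (U) pairs with template adenine.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def synthesize(template_3_to_5, rna=False):
    """Copy a DNA template (written 3' to 5') into a new strand (written 5' to 3').

    With rna=False the result models DNA polymerization; with rna=True it
    models transcription by RNA polymerase.
    """
    pairing = RNA_PAIR if rna else DNA_PAIR
    product = []
    for base in template_3_to_5:
        # Each step adds one complementary nucleotide to the 3' end of the product.
        product.append(pairing[base])
    return "".join(product)


template = "TACGGT"  # hypothetical template, read 3' to 5'
new_dna = synthesize(template)
new_rna = synthesize(template, rna=True)
print(new_dna, new_rna)
```

The only difference between the two modes in this sketch is the pairing table, which mirrors the point in the text that both polymerizations proceed 5' to 3' but use different nucleotide substrates.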
Ligases 

DNA repair is the process by which accidental base changes, such as those produced by
oxidative damage, hydrolytic attack, or uncontrolled methylation of DNA, are corrected before
replication or transcription of the DNA can occur. Because of the efficiency of the DNA repair
process, fewer than one in a thousand accidental base changes causes a mutation (Alberts, supra, pp.
245-249). The three steps common to most types of DNA repair are (1) excision of the damaged or
altered base or nucleotide by DNA nucleases, (2) insertion of the correct nucleotide in the gap left by
the excised nucleotide by DNA polymerase using the complementary strand as the template and, (3)
sealing the break left between the inserted nucleotide(s) and the existing DNA strand by DNA ligase.
In the last reaction, DNA ligase uses the energy from ATP hydrolysis to activate the 5' end of the
broken phosphodiester bond before forming the new bond with the 3'-OH of the DNA strand. In
Bloom's syndrome, an inherited human disease, individuals are partially deficient in DNA ligation and
consequently have an increased incidence of cancer (Alberts, supra, p. 247).
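
The three repair steps listed above (excision, insertion, sealing) can be sketched as a toy string operation. This assumes a single damaged position whose location is already known, and it writes both strands so that index i of one pairs with index i of the other, which ignores strand polarity. The function name, the placeholder "X" for a damaged base, and the example strands are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def excision_repair(damaged, partner, position):
    """Toy three-step repair of one base in `damaged`.

    Both strands are written so that index i of one pairs with index i of
    the other; `position` is the index of the altered nucleotide.
    """
    # Step 1 (nuclease): excise the altered nucleotide, leaving a gap.
    left, right = damaged[:position], damaged[position + 1:]
    # Step 2 (polymerase): insert the nucleotide that pairs with the partner strand.
    filled = COMPLEMENT[partner[position]]
    # Step 3 (ligase): seal the break by joining the pieces into one strand.
    return left + filled + right


# "X" stands for a damaged base that no longer pairs correctly.
repaired = excision_repair("ATXC", "TACG", 2)
print(repaired)
```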
Nucleases 

Nucleases comprise enzymes that hydrolyze both DNA (DNase) and RNA (RNase). They
serve different purposes in nucleic acid metabolism. Nucleases hydrolyze the phosphodiester bonds
between adjacent nucleotides either at internal positions (endonucleases) or at the terminal 3' or 5'
nucleotide positions (exonucleases). A DNA exonuclease activity in DNA polymerase, for example,
serves to remove improperly paired nucleotides attached to the 3'-OH end of the growing DNA strand
by the polymerase and thereby serves a "proofreading" function. As mentioned above, DNA
endonuclease activity is involved in the excision step of the DNA repair process.
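
The proofreading function described above can be sketched as trimming mismatched nucleotides from the 3' end of a growing strand. The sketch assumes the template is written 3' to 5', the growing strand is written 5' to 3' and aligned with the template from its first position, and the growing strand is no longer than the template. The function name and sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def proofread(primer_5_to_3, template_3_to_5):
    """Remove improperly paired nucleotides from the 3' end of the growing strand.

    Position i of the primer is expected to pair with position i of the
    template. Trimming stops at the first correctly paired 3' nucleotide.
    """
    primer = list(primer_5_to_3)
    while primer and COMPLEMENT[template_3_to_5[len(primer) - 1]] != primer[-1]:
        # 3' exonuclease step: hydrolyze the terminal, mispaired nucleotide.
        primer.pop()
    return "".join(primer)


template = "TACGGT"                       # hypothetical template, read 3' to 5'
trimmed = proofread("ATGA", template)     # last base A is mispaired with template G
unchanged = proofread("ATGC", template)   # fully paired strand is left intact
print(trimmed, unchanged)
```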

RNases also serve a variety of functions. For example, RNase P is a ribonucleoprotein
enzyme which cleaves the 5' end of pre-tRNAs as part of their maturation process. RNase H digests
the RNA strand of an RNA/DNA hybrid. Such hybrids occur in cells invaded by retroviruses, and
RNase H is an important enzyme in the retroviral replication cycle. Pancreatic RNase secreted by
the pancreas into the intestine hydrolyzes RNA present in ingested foods. RNase activity in serum
and cell extracts is elevated in a variety of cancers and infectious diseases (Schein, C.H. (1997) Nat.
Biotechnol. 15:529-536). Regulation of RNase activity is being investigated as a means to control
tumor angiogenesis, allergic reactions, viral infection and replication, and fungal infections.
MODIFICATION OF NUCLEIC ACIDS
DNA Repair 

Cells are constantly faced with replication errors and environmental assault (such as
ultraviolet irradiation) that can produce DNA damage. Damage to DNA consists of any change that
modifies the structure of the molecule. Changes to DNA can be divided into two general classes,
single base changes and structural distortions. Single base changes affect the sequence but not the
overall structure of the DNA. Since single base changes do not affect transcription or replication,
they exert their effect on future generations. Structural distortions affect the structure of the DNA.
A single strand nick or removal of a base may prevent a strand from acting as a viable template for
synthesis of DNA or RNA. Intrastrand or interstrand covalent linkage between bases, or the addition
of a bulky adduct to a base, may distort the structure of the double helix and interfere with
transcription and replication. Any damage to DNA can produce a mutation, and the mutation may
produce a disorder, such as cancer.

Changes in DNA are recognized by repair systems within the cell. These repair systems act 
to correct the damage and thus prevent any deleterious effects of a mutational event. Repair systems
can be divided into three general types: direct repair, excision repair, and retrieval systems. When the
repair systems are eliminated, cells become exceedingly sensitive to environmental mutagens, such as 
ultraviolet irradiation. Disorders associated with a loss in DNA repair systems often exhibit a high 
sensitivity to environmental mutagens. Examples of such disorders include xeroderma pigmentosum, 
Bloom's syndrome, and Werner's syndrome. Xeroderma pigmentosum results in a hypersensitivity to 
sunlight, especially ultraviolet, and produces skin defects. Bloom's syndrome results in an increased
frequency of chromosomal aberrations, including sister chromatid exchanges (Yamagata, K. et al.
(1998) Proc. Natl. Acad. Sci. USA 95:8733-8738).

Direct repair involves the reversal or simple removal of the damaged region of DNA. 
Mismatches involving normal bases are repaired based on certain biases within the repair system. 
For example, mismatched GT base pairs are frequently caused by deamination of 5-methyl-cytosine to
form thymine. Therefore, repair systems convert mismatched GT pairs to GC, instead of AT. Repair
also favors the non-methylated strand in hemimethylated DNA, since this strand represents the newly
synthesized daughter strand. The recognition of hemimethylated DNA and repair of mismatches on
the non-methylated strand involve the products of the genes mutH, mutL, mutS (which specifically
recognizes mismatched base pairs), the helicase encoded by the uvrD gene, and the methylase
encoded by the dam gene. C-5 cytosine-specific DNA methylases are enzymes that specifically
methylate the C-5 carbon of cytosines in DNA (Kumar, S. et al. (1994) Nucleic Acids Res. 22:1-10).
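
The two repair biases described above (correcting the non-methylated daughter strand, and resolving GT mismatches to GC) can be illustrated with a minimal Python sketch. The function name choose_repair and the single-position representation of a duplex are hypothetical and introduced only for illustration; the sketch models the decision rules stated in the text, not the enzymology of the mutH, mutL, and mutS gene products.

```python
# Watson-Crick partners used to restore a proper base pair.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def choose_repair(top_base, bottom_base, top_methylated=None):
    """Return the repaired (top, bottom) base pair for one position.

    top_methylated is True or False when the duplex is hemimethylated and
    the methylated (parental) strand is known, and None when neither strand
    can be identified as parental. This encoding is an assumption of the sketch.
    """
    if COMPLEMENT[top_base] == bottom_base:
        return top_base, bottom_base  # already paired, nothing to repair

    if top_methylated is True:
        # Top strand is parental: rewrite the non-methylated bottom strand.
        return top_base, COMPLEMENT[top_base]
    if top_methylated is False:
        # Bottom strand is parental: rewrite the non-methylated top strand.
        return COMPLEMENT[bottom_base], bottom_base

    # No strand information: apply the GT bias, which assumes that the T
    # arose by deamination of 5-methyl-cytosine.
    if {top_base, bottom_base} == {"G", "T"}:
        return ("G", "C") if top_base == "G" else ("C", "G")

    return top_base, bottom_base  # no rule in this sketch; leave unchanged
```

For example, choose_repair("G", "T") yields a GC pair, while choose_repair("A", "C", top_methylated=True) keeps the parental A and rewrites the daughter base to T.
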

Excision repair is a system in which mispaired or damaged bases are removed from DNA and
a new stretch of DNA is synthesized to replace them. In the incision step, the damaged structure is
recognized by an endonuclease that cleaves the DNA strand on both sides of the damage. In the
excision step, a 5'-3' exonuclease removes a stretch of the damaged DNA strand. In the synthesis
step, the resulting single-stranded region serves as a template for a DNA polymerase to synthesize a
replacement for the excised sequence. Finally, DNA ligase covalently links the 3' end of the new
material to the old material. In mammals, DNA polymerase beta serves as the DNA repair 
polymerase. Mutations in the human DNA polymerase beta gene are associated with several types of 
cancer (Bhattacharyya, N. et al. (1999) DNA Cell Biol. 18:549-554; Matsuzaki, J. et al. (1996) Mol. 
Carcinog. 15:38-43). 
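
The four steps of excision repair (incision, excision, synthesis, and ligation) can be sketched on a pair of aligned strings. This is an illustrative model only: the function name excision_repair, the use of string indices to mark the incision sites, and the decision to ignore strand polarity are all assumptions made for the sketch.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def excision_repair(damaged, template, start, end):
    """Repair damaged[start:end] using the aligned complementary template.

    Both strands are written left to right at matching positions, so
    template[i] pairs with damaged[i]; strand polarity is ignored here.
    """
    # Incision: an endonuclease cuts the strand on both sides of the damage.
    left, right = damaged[:start], damaged[end:]
    # Excision: the stretch between the cuts is removed, exposing the
    # single-stranded template region.
    gap_template = template[start:end]
    # Synthesis: a polymerase copies the exposed template.
    patch = "".join(COMPLEMENT[base] for base in gap_template)
    # Ligation: the new material is joined to the old material.
    return left + patch + right
```

With template "TACGGATC" and a damaged strand "ATGXXTAG" (X marking damaged bases), excising positions 3 to 5 restores "ATGCCTAG".
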

Methylases

Methylation of specific nucleotides occurs in both DNA and RNA, and serves different 
functions in the two macromolecules. Methylation of cytosine residues to form 5-methyl cytosine in 
DNA occurs specifically in CG sequences which are base-paired with one another in the DNA 
double-helix. The pattern of methylation is passed from generation to generation during DNA
replication by an enzyme called "maintenance methylase" that acts preferentially on those CG
sequences that are base-paired with a CG sequence that is already methylated. Such methylation
appears to distinguish active from inactive genes by preventing the binding of regulatory proteins that
"turn on" the gene, but permitting the binding of proteins that inactivate the gene (Alberts, supra, pp.
448-451). In RNA metabolism, "tRNA methylase" produces one of several nucleotide modifications
in tRNA that affect the conformation and base-pairing of the molecule and facilitate the recognition of
the appropriate mRNA codons by specific tRNAs. The primary methylation pattern is the
dimethylation of guanine residues to form N2,N2-dimethyl guanine.
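
The inheritance of a CG methylation pattern by a maintenance methylase can be sketched as follows. The function name maintain_methylation and the coordinate convention (positions are indices on the parental strand, and the daughter-strand cytosine is reported at the index of the parental G it pairs with) are assumptions introduced for this illustration.

```python
def maintain_methylation(parent, parent_methylated):
    """Return daughter-strand positions to methylate after replication.

    parent is the parental strand written 5' to 3'; parent_methylated is a
    set of indices of methylated cytosines on that strand. Because CG is
    self-complementary, the C of the daughter-strand CG pairs with the G
    that follows the parental C, that is, with index i + 1.
    """
    daughter = set()
    for i in sorted(parent_methylated):
        # Act only on CG sequences paired with an already methylated CG.
        if parent[i:i + 2] == "CG":
            daughter.add(i + 1)
    return daughter
```

For the parental strand "ACGTTCGA" with a methylated cytosine at index 1, the sketch methylates the daughter cytosine opposite index 2 and leaves the unmethylated CG at index 5 untouched.
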
Helicases and Single-stranded Binding Proteins 

Helicases are enzymes that destabilize and unwind double helix structures in both DNA and
RNA. Since DNA replication occurs more or less simultaneously on both strands, the two strands
must first separate to generate a replication "fork" for DNA polymerase to act on. Two types of
replication proteins contribute to this process, DNA helicases and single-stranded binding proteins.
DNA helicases hydrolyze ATP and use the energy of hydrolysis to separate the DNA strands.
Single-stranded binding proteins (SSBs) then bind to the exposed DNA strands, without covering the
bases, thereby temporarily stabilizing them for templating by the DNA polymerase (Alberts, supra, pp. 
255-256). 

RNA helicases also alter and regulate RNA conformation and secondary structure. Like the
DNA helicases, RNA helicases utilize energy derived from ATP hydrolysis to destabilize and unwind
RNA duplexes. The best-characterized and most ubiquitous family of RNA helicases is the DEAD-box
family, so named for the conserved B-type ATP-binding motif which is diagnostic of proteins in
this family. Over 40 DEAD-box helicases have been identified in organisms as diverse as bacteria,
insects, yeast, amphibians, mammals, and plants. DEAD-box helicases function in diverse processes
such as translation initiation, splicing, ribosome assembly, and RNA editing, transport, and stability.
Examples of these RNA helicases include yeast Drs1 protein, which is involved in ribosomal RNA
processing; yeast TIF1 and TIF2 and mammalian eIF-4A, which are essential to the initiation of RNA
translation; and human p68 antigen, which regulates cell growth and division (Ripmaster, T.L. et al.
(1992) Proc. Natl. Acad. Sci. USA 89:11131-11135; Chang, T.-H. et al. (1990) Proc. Natl. Acad. Sci.
USA 87:1571-1575). These RNA helicases demonstrate strong sequence homology over a stretch of
some 420 amino acids. Included among these conserved sequences are the consensus sequence for
the A motif of an ATP binding protein; the "DEAD box" sequence, associated with ATPase activity;
the sequence SAT, associated with the actual helicase unwinding region; and an octapeptide
consensus sequence, required for RNA binding and ATP hydrolysis (Pause, A. et al. (1993) Mol. Cell.
Biol. 13:6789-6798). Differences outside of these conserved regions are believed to reflect
differences in the functional roles of individual proteins (Chang et al., supra).

Some DEAD-box helicases play tissue- and stage-specific roles in spermatogenesis and 
embryogenesis. Overexpression of the DEAD-box 1 protein (DDX1) may play a role in the 
progression of neuroblastoma (Nb) and retinoblastoma (Rb) tumors (Godbout, R. et al. (1998) J. Biol.
Chem. 273:21161-21168). These observations suggest that DDX1 may promote or enhance tumor
progression by altering the normal secondary structure and expression levels of RNA in cancer cells. 
Other DEAD-box helicases have been implicated either directly or indirectly in tumorigenesis 
(Godbout et al., supra). For example, murine p68 is mutated in ultraviolet light-induced tumors, and 
human DDX6 is located at a chromosomal breakpoint associated with B-cell lymphoma. Similarly, a 
chimeric protein comprised of DDX10 and NUP98, a nucleoporin protein, may be involved in the
pathogenesis of certain myeloid malignancies. 
Topoisomerases 

Besides the need to separate DNA strands prior to replication, the two strands must be
"unwound" from one another prior to their separation by DNA helicases. This function is performed
by proteins known as DNA topoisomerases. DNA topoisomerase effectively acts as a reversible
nuclease that hydrolyzes a phosphodiester bond in a DNA strand, permits the two strands to rotate
freely about one another to remove the strain of the helix, and then rejoins the original phosphodiester
bond between the two strands. Topoisomerases are essential enzymes responsible for the topological
rearrangement of DNA brought about by transcription, replication, chromatin formation,
recombination, and chromosome segregation. Superhelical coils are introduced into DNA by the
passage of processive enzymes such as RNA polymerase, or by the separation of DNA strands by a
helicase prior to replication. Knotting and concatenation can occur in the process of DNA synthesis,
storage, and repair. All topoisomerases work by breaking a phosphodiester bond in the
deoxyribose-phosphate backbone of DNA. A catalytic tyrosine residue on the enzyme makes a nucleophilic attack
on the scissile phosphodiester bond, resulting in a reaction intermediate in which a covalent bond is
formed between the enzyme and one end of the broken strand. A tyrosine-DNA phosphodiesterase
functions in DNA repair by hydrolyzing this bond in occasional dead-end topoisomerase I-DNA
intermediates (Pouliot, J.J. et al. (1999) Science 286:552-555).

Two types of DNA topoisomerase exist, types I and II. Type I topoisomerases work as 
monomers, making a break in a single strand of DNA while type II topoisomerases, working as 
homodimers, cleave both strands. DNA topoisomerase I causes a single-strand break in a DNA
helix to allow the rotation of the two strands of the helix about the remaining phosphodiester bond in
the opposite strand. DNA topoisomerase II causes a transient break in both strands of a DNA helix
where two double helices cross over one another. This type of topoisomerase can efficiently separate
two interlocked DNA circles (Alberts, supra, pp. 260-262). Type II topoisomerases are largely
confined to proliferating cells in eukaryotes, such as cancer cells. For this reason they are targets for
anticancer drugs. Topoisomerase II has been implicated in multi-drug resistance (MDR) as it appears
to aid in the repair of DNA damage inflicted by DNA binding agents such as doxorubicin and
vincristine. 

The topoisomerase I family includes topoisomerases I and III (topo I and topo III). The
crystal structure of human topoisomerase I suggests that rotation about the intact DNA strand is
partially controlled by the enzyme. In this "controlled rotation" model, protein-DNA interactions limit
the rotation, which is driven by torsional strain in the DNA (Stewart, L. et al. (1998) Science
279:1534-1541). Structurally, topo I can be recognized by its catalytic tyrosine residue and a number
of other conserved residues in the active site region. Topo I is thought to function during transcription.
Two topo IIIs are known in humans, and they are homologous to prokaryotic topoisomerase I, with a
conserved tyrosine and active site signature specific to this family. Topo III has been suggested to
play a role in meiotic recombination. A mouse topo III is highly expressed in testis tissue and its
expression increases with the increase in the number of cells in pachytene (Seki, T. et al. (1998) J.
Biol. Chem. 273:28553-28556).

The topoisomerase II family includes two isozymes (IIα and IIβ) encoded by different genes.
Topo II cleaves double stranded DNA in a reproducible, nonrandom fashion, preferentially in an AT
rich region, but the basis of cleavage site selectivity is not known. Structurally, topo II is made up of
four domains, the first two of which are structurally similar and probably distantly homologous to
similar domains in eukaryotic topo I. The second domain bears the catalytic tyrosine, as well as a
highly conserved pentapeptide. The IIα isoform appears to be responsible for unlinking DNA during
chromosome segregation. Cell lines expressing IIα but not IIβ suggest that IIβ is dispensable in
cellular processes; however, IIβ knockout mice died perinatally due to a failure in neural development.
That the major abnormalities occurred in predominantly late developmental events (neurogenesis)
suggests that IIβ is needed not at mitosis, but rather during DNA repair (Yang, X. et al. (2000)
Science 287:131-134).

Topoisomerases have been implicated in a number of disease states, and topoisomerase
poisons have proven to be effective anti-tumor drugs for some human malignancies. Topo I is
mislocalized in Fanconi's anemia, and may be involved in the chromosomal breakage seen in this
disorder (Wunder, E. (1984) Hum. Genet. 68:276-281). Overexpression of a truncated topo III in
ataxia-telangiectasia (A-T) cells partially suppresses the A-T phenotype, probably through a dominant
negative mechanism. This suggests that topo III is deregulated in A-T (Fritz, E. et al. (1997) Proc.
Natl. Acad. Sci. USA 94:4538-4542). Topo III also interacts with the Bloom's Syndrome gene
product, and has been suggested to have a role as a tumor suppressor (Wu, L. et al. (2000) J. Biol.
Chem. 275:9636-9644). Aberrant topo II activity is often associated with cancer or increased cancer
risk. Greatly lowered topo II activity has been found in some, but not all, A-T cell lines (Mohamed, R.
et al. (1987) Biochem. Biophys. Res. Commun. 149:233-238). On the other hand, topo II can break
DNA in the region of the A-T gene (ATM), which controls all DNA damage-responsive cell cycle
checkpoints (Kaufmann, W.K. (1998) Proc. Soc. Exp. Biol. Med. 217:327-334). The ability of
topoisomerases to break DNA has been used as the basis of antitumor drugs. Topoisomerase poisons
act by increasing the number of dead-end covalent DNA-enzyme complexes in the cell, ultimately
triggering cell death pathways (Fortune, J.M. and N. Osheroff (2000) Prog. Nucleic Acid Res. Mol.
Biol. 64:221-253; Guichard, S.M. and M.K. Danks (1999) Curr. Opin. Oncol. 11:482-489). Antibodies
against topo I are found in the serum of systemic sclerosis patients, and the levels of the antibody may
be used as a marker of pulmonary involvement in the disease (Diot, E. et al. (1999) Chest 116:715-
720). Finally, the DNA binding region of human topo I has been used as a DNA delivery vehicle for
gene therapy (Chen, T.Y. et al. (2000) Appl. Microbiol. Biotechnol. 53:558-567).
Recombinases 

Genetic recombination is the process of rearranging DNA sequences within an organism's
genome to provide genetic variation for the organism in response to changes in the environment.
DNA recombination allows variation in the particular combination of genes present in an individual's
genome, as well as the timing and level of expression of these genes (Alberts, supra, pp. 263-273).
Two broad classes of genetic recombination are commonly recognized: general recombination and
site-specific recombination. General recombination involves genetic exchange between any
homologous pair of DNA sequences usually located on two copies of the same chromosome. The
process is aided by enzymes, recombinases, that "nick" one strand of a DNA duplex more or less
randomly and permit exchange with a complementary strand on another duplex. The process does not
normally change the arrangement of genes in a chromosome. In site-specific recombination, the
recombinase recognizes specific nucleotide sequences present in one or both of the recombining
molecules. Base-pairing is not involved in this form of recombination and therefore it does not require
DNA homology between the recombining molecules. Unlike general recombination, this form of
recombination can alter the relative positions of nucleotide sequences in chromosomes.
RNA METABOLISM 

Much of the regulation of gene expression in eukaryotic cells occurs at the posttranscriptional
level. Messenger RNAs (mRNA), which are produced in the cell nucleus from primary transcripts of
protein-encoding genes, are processed and transported to the cytoplasm where the protein synthesis
machinery is located. RNA-binding proteins are a group of proteins that participate in the processing,
editing, transport, localization, and posttranscriptional regulation of mRNAs, and comprise the protein
component of ribosomes as well. The RNA-binding activity of many of these proteins is mediated by
a series of RNA-binding motifs identified within them. These domains include the RNP motif, the
arginine-rich motif, the RGG box, and the KH motif. (Reviewed in Burd, C.G. and Dreyfuss, G.
(1994) Science 265:615-621.) The RNP motif is the most widely found and best characterized of
these motifs. The RNP motif is composed of 90-100 amino acids which form an RNA-binding domain
and is found in one or more copies in proteins that bind pre-mRNA, mRNA, pre-ribosomal RNA, and
small nuclear RNAs. The RNP motif is composed of two short sequences (RNP-1 and RNP-2) and
a number of other mostly hydrophobic, conserved amino acids interspersed throughout the motif.
(Burd, supra; ExPASy PROSITE document PDOC0030.)

Ribonucleic acid (RNA) is a linear single-stranded polymer of four nucleotides, ATP, CTP,
UTP, and GTP. In most organisms, RNA is transcribed as a copy of deoxyribonucleic acid (DNA), 
the genetic material of the organism. In retroviruses RNA rather than DNA serves as the genetic 
material. RNA copies of the genetic material encode proteins or serve various structural, catalytic, or
regulatory roles in organisms. RNA is classified according to its cellular localization and function.
Messenger RNAs (mRNAs) encode polypeptides. Ribosomal RNAs (rRNAs) are assembled, along 
with ribosomal proteins, into ribosomes, which are cytoplasmic particles that translate mRNA into 
polypeptides. Transfer RNAs (tRNAs) are cytosolic adaptor molecules that function in mRNA 
translation by recognizing both an mRNA codon and the amino acid that matches that codon.
Heterogeneous nuclear RNAs (hnRNAs) include mRNA precursors and other nuclear RNAs of
various sizes. Small nuclear RNAs (snRNAs) are a part of the nuclear spliceosome complex that 
removes intervening, non-coding sequences (introns) and rejoins exons in pre-mRNAs. 

Proteins are associated with RNA during its transcription from DNA, RNA processing, and 
translation of mRNA into protein. Proteins are also associated with RNA as it is used for structural,
catalytic, and regulatory purposes.
RNA Processing 

Ribosomal RNAs (rRNAs) are assembled, along with ribosomal proteins, into ribosomes, 
which are cytoplasmic particles that translate messenger RNA (mRNA) into polypeptides. The 
eukaryotic ribosome is composed of a 60S (large) subunit and a 40S (small) subunit, which together
form the 80S ribosome. In addition to the 18S, 28S, 5S, and 5.8S rRNAs, ribosomes contain from 50
to over 80 different ribosomal proteins, depending on the organism. Ribosomal proteins are classified
according to the subunit to which they belong (i.e., L if associated with the large 60S subunit or S if
associated with the small 40S subunit). E. coli ribosomes have been the most thoroughly studied and
contain 50 proteins, many of which are conserved in all life forms. The structures of nine ribosomal
proteins have been solved to less than 3.0 Å resolution (i.e., S5, S6, S17, L1, L6, L9, L12, L14, L30),
revealing common motifs, such as β-α-β protein folds in addition to acidic and basic RNA-binding
motifs positioned between β-strands. Most ribosomal proteins are believed to contact rRNA directly
(reviewed in Liljas, A. and M. Garber (1995) Curr. Opin. Struct. Biol. 5:721-727; see also Woodson,
S.A. and N.B. Leontis (1998) Curr. Opin. Struct. Biol. 8:294-300; Ramakrishnan, V. and S.W. White
(1998) Trends Biochem. Sci. 23:208-212).

Ribosomal proteins may undergo post-translational modifications or interact with other 
ribosome-associated proteins to regulate translation. For example, the highly homologous 40S 
ribosomal protein S6 kinases (S6K1 and S6K2) play a key role in the regulation of cell growth by
controlling the biosynthesis of translational components which make up the protein synthetic apparatus
(including the ribosomal proteins). In the case of S6K1, at least eight phosphorylation sites are
believed to mediate kinase activation in a hierarchical fashion (Dufner and Thomas (1999) Exp. Cell
Res. 253:100-109). Some of the ribosomal proteins, including L1, also function as translational
repressors by binding to polycistronic mRNAs encoding ribosomal proteins (reviewed in Liljas and
Garber, supra). 

Recent evidence suggests that a number of ribosomal proteins have secondary functions 
independent of their involvement in protein biosynthesis. These proteins function as regulators of cell 
proliferation and, in some instances, as inducers of cell death. For example, the expression of human
ribosomal protein L13a has been shown to induce apoptosis by arresting cell growth in the G2/M
phase of the cell cycle. Inhibition of expression of L13a induces apoptosis in target cells, which
suggests that this protein is necessary, in the appropriate amount, for cell survival. Similar results have
been obtained in yeast where inactivation of yeast homologues of L13a, rp22 and rp23, results in
severe growth retardation and death. A closely related ribosomal protein, L7, arrests cells in G1 and
also induces apoptosis. Thus, it appears that a subset of ribosomal proteins may function as cell cycle
checkpoints and compose a new family of cell proliferation regulators. 

Mapping of individual ribosomal proteins on the surface of intact ribosomes is accomplished 
using 3D immunocryoelectron microscopy, whereby antibodies raised against specific ribosomal
proteins are visualized. Progress has been made toward the mapping of L1, L7, and L12 while the
structure of the intact ribosome has been solved to only 20-25 Å resolution and inconsistencies exist
among different crude structures (Frank, J. (1997) Curr. Opin. Struct. Biol. 7:266-272).

Three distinct sites have been identified on the ribosome. The aminoacyl-tRNA acceptor site 
(A site) receives charged tRNAs (with the exception of the initiator-tRNA). The peptidyl-tRNA site 
(P site) binds the nascent polypeptide as the amino acid from the A site is added to the elongating
chain. Deacylated tRNAs bind in the exit site (E site) prior to their release from the ribosome. (The
structure of the ribosome is reviewed in Stryer, L. (1995) Biochemistry, W.H. Freeman and Company,
New York NY, pp. 888-908; Lodish, supra, pp. 119-138; and Lewin, B. (1997) Genes VI, Oxford
University Press, Inc. New York NY). 

Various proteins are necessary for processing of transcribed RNAs in the nucleus. Pre-mRNA
processing steps include capping at the 5' end with methylguanosine, polyadenylating the 3'
end, and splicing to remove introns. The primary RNA transcript from DNA is a faithful copy of the
gene containing both exon and intron sequences, and the latter sequences must be cut out of the RNA
transcript to produce a mRNA that codes for a protein. This "splicing" of the mRNA sequence takes
place in the nucleus with the aid of a large, multicomponent ribonucleoprotein complex known as a
spliceosome. The spliceosomal complex is comprised of five small nuclear ribonucleoprotein particles
(snRNPs) designated U1, U2, U4, U5, and U6. Each snRNP contains a single species of snRNA and
about ten proteins. The RNA components of some snRNPs recognize and base-pair with intron
consensus sequences. The protein components mediate spliceosome assembly and the splicing
reaction. Autoantibodies to snRNP proteins are found in the blood of patients with systemic lupus 
erythematosus (Stryer, supra, p. 863). 

Heterogeneous nuclear ribonucleoproteins (hnRNPs) have been identified that have roles in
splicing, exporting of the mature RNAs to the cytoplasm, and mRNA translation (Biamonti, G. et al.
(1998) Clin. Exp. Rheumatol. 16:317-326). Some examples of hnRNPs include the yeast proteins
Hrp1p, involved in cleavage and polyadenylation at the 3' end of the RNA; Cbp80p, involved in
capping the 5' end of the RNA; and Npl3p, a homolog of mammalian hnRNP A1, involved in export of
mRNA from the nucleus (Shen, E.C. et al. (1998) Genes Dev. 12:679-691). HnRNPs have been
shown to be important targets of the autoimmune response in rheumatic diseases (Biamonti et al.,
supra).

Many snRNP and hnRNP proteins are characterized by an RNA recognition motif (RRM)
(reviewed in Birney, E. et al. (1993) Nucleic Acids Res. 21:5803-5816). The RRM is about 80 amino
acids in length and forms four β-strands and two α-helices arranged in an α/β sandwich. The RRM
contains a core RNP-1 octapeptide motif along with surrounding conserved sequences. In addition to
snRNP proteins, examples of RNA-binding proteins which contain the above motifs include
heterogeneous nuclear ribonucleoproteins which stabilize nascent RNA and factors which regulate alternative
splicing. Alternative splicing factors include developmentally regulated proteins, specific examples of
which have been identified in lower eukaryotes such as Drosophila melanogaster and
Caenorhabditis elegans. These proteins play key roles in developmental processes such as pattern
formation and sex determination, respectively (Hodgkin, J. et al. (1994) Development 120:3681-3689).

The 3' ends of most eukaryotic mRNAs are also posttranscriptionally modified by
polyadenylation. Polyadenylation proceeds through two enzymatically distinct steps: (i) the
endonucleolytic cleavage of nascent mRNAs at cis-acting polyadenylation signals in the
3'-untranslated (non-coding) region and (ii) the addition of a poly(A) tract to the 5' mRNA fragment.
The presence of cis-acting RNA sequences is necessary for both steps. These sequences include
5'-AAUAAA-3' located 10-30 nucleotides upstream of the cleavage site and a less well-conserved GU-
or U-rich sequence element located 10-30 nucleotides downstream of the cleavage site. Cleavage
stimulation factor (CstF), cleavage factor I (CF I), and cleavage factor II (CF II) are involved in the
cleavage reaction while cleavage and polyadenylation specificity factor (CPSF) and poly(A)
polymerase (PAP) are necessary for both cleavage and polyadenylation. An additional enzyme,
poly(A)-binding protein II (PAB II), promotes poly(A) tract elongation (Rüegsegger, U. et al. (1996)
J. Biol. Chem. 271:6107-6113; and references within).
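
The positional constraint on the AAUAAA signal can be checked with a short Python sketch. The function name find_poly_a_signal and its parameters are hypothetical. As a further assumption, the distance is measured from the last nucleotide of the hexamer to the cleavage site, since the text does not state the reference point.

```python
def find_poly_a_signal(rna, cleavage_site, signal="AAUAAA",
                       min_gap=10, max_gap=30):
    """Return start indices of `signal` lying 10-30 nt upstream of the cut.

    cleavage_site is the 0-based index of the first nucleotide after the
    cut. The gap counts the nucleotides between the end of the signal and
    the cleavage site (an assumption of this sketch).
    """
    hits = []
    for i in range(len(rna) - len(signal) + 1):
        if rna[i:i + len(signal)] == signal:
            gap = cleavage_site - (i + len(signal))
            if min_gap <= gap <= max_gap:
                hits.append(i)
    return hits
```

A transcript such as "GG" + "AAUAAA" + 15 spacer nucleotides, cleaved at index 23, yields one hit at index 2; moving the cleavage site to index 10 leaves a gap of only 2 nucleotides and yields no hit.
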
TRANSLATION

Correct translation of the genetic code depends upon each amino acid forming a linkage with 
the appropriate transfer RNA (tRNA). The aminoacyl-tRNA synthetases (aaRSs) are essential 
proteins found in all living organisms. The aaRSs are responsible for the activation and correct 
attachment of an amino acid with its cognate tRNA, as the first step in protein biosynthesis.
Prokaryotic organisms have at least twenty different types of aaRSs, one for each different amino
acid, while eukaryotes usually have two aaRSs, a cytosolic form and a mitochondrial form, for each
different amino acid. The 20 aaRS enzymes can be divided into two structural classes. Class I
enzymes add amino acids to the 2' hydroxyl at the 3' end of tRNAs while Class II enzymes add amino
acids to the 3' hydroxyl at the 3' end of tRNAs. Each class is characterized by a distinctive topology
of the catalytic domain. Class I enzymes contain a catalytic domain based on the nucleotide-binding
Rossmann fold. In particular, a consensus tetrapeptide motif is highly conserved (Prosite Document
PDOC00161, Aminoacyl-transfer RNA synthetases class-I signature). Class I enzymes are specific
for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan,
and valine. Class II enzymes contain a central catalytic domain, which consists of a seven-stranded
antiparallel β-sheet domain, as well as N- and C-terminal regulatory domains. Class II enzymes are
separated into two groups based on the heterodimeric or homodimeric structure of the enzyme; the 
latter group is further subdivided by the structure of the N- and C-terminal regulatory domains 
(Hartlein, M. and S. Cusack (1995) J. Mol. Evol. 40:519-530). Class II enzymes are specific for 
alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine. 
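
The class assignments listed above can be captured in a small lookup table. The names CLASS_I, CLASS_II, and aars_class, and the use of three-letter amino acid codes, are illustrative choices rather than part of any established library.

```python
# Amino acid specificities of the two aaRS classes, as listed in the text.
CLASS_I = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}


def aars_class(amino_acid):
    """Return (class, tRNA hydroxyl that receives the amino acid)."""
    if amino_acid in CLASS_I:
        return "I", "2'-OH"   # Class I enzymes acylate the 2' hydroxyl
    if amino_acid in CLASS_II:
        return "II", "3'-OH"  # Class II enzymes acylate the 3' hydroxyl
    raise ValueError(f"unknown amino acid code: {amino_acid}")
```

For instance, aars_class("Ile") returns ("I", "2'-OH") and aars_class("Ser") returns ("II", "3'-OH"); together the two sets cover all 20 amino acids without overlap.
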

Certain aaRSs also have editing functions. IleRS, for example, can misactivate valine to form
Val-tRNA(Ile), but this product is cleared by a hydrolytic activity that destroys the mischarged product.
This editing activity is located within a second catalytic site found in the connective polypeptide 1
region (CP1), a long insertion sequence within the Rossmann fold domain of Class I enzymes
(Schimmel, P. et al. (1998) FASEB J. 12:1599-1609). AaRSs also play a role in tRNA processing. It
has been shown that mature tRNAs are charged with their respective amino acids in the nucleus
before export to the cytoplasm, and charging may serve as a quality control mechanism to ensure the
tRNAs are functional (Martinis, S.A. et al. (1999) EMBO J. 18:4591-4596). 

Under optimal conditions, polypeptide synthesis proceeds at a rate of approximately 40 amino
acid residues per second. The rate of misincorporation during translation is on the order of 10^-4 and is
primarily the result of aminoacyl-tRNAs being charged with the incorrect amino acid. Incorrectly
charged tRNAs are toxic to cells as they result in the incorporation of incorrect amino acid residues
into an elongating polypeptide. The rate of translation is presumed to be a compromise between the
optimal rate of elongation and the need for translational fidelity. Mathematical calculations predict that
10^-4 is indeed the maximum acceptable error rate for protein synthesis in a biological system (reviewed
in Stryer, supra; and Watson, J. et al. (1987) The Benjamin/Cummings Publishing Co., Inc. Menlo
Park, CA). A particularly error prone aminoacyl-tRNA charging event is the charging of tRNA(Gln)
with Gln. A mechanism exists for the correction of this mischarging event which likely has its origins in
evolution. Gln was among the last of the 20 naturally occurring amino acids used in polypeptide
synthesis to appear in nature. Gram positive eubacteria, cyanobacteria, Archaea, and eukaryotic
organelles possess a noncanonical pathway for the synthesis of Gln-tRNA(Gln) based on the
transformation of Glu-tRNA(Gln) (synthesized by Glu-tRNA synthetase, GluRS) using the enzyme
Glu-tRNA(Gln) amidotransferase (Glu-AdT). The reactions involved in the transamidation pathway are as
follows (Curnow, A.W. et al. (1997) Nucleic Acids Symposium 36:2-4):

    tRNA(Gln) + Glu + ATP  --GluRS-->  Glu-tRNA(Gln) + AMP + PPi

    Glu-tRNA(Gln) + Gln + ATP  --Glu-AdT-->  Gln-tRNA(Gln) + Glu + ADP + Pi
A similar enzyme, Asp-tRNA^Asn amidotransferase, exists in Archaea, which transforms
Asp-tRNA^Asn to Asn-tRNA^Asn. Formylase, the enzyme that transforms Met-tRNA^fMet to fMet-tRNA^fMet in
eubacteria, is likely to be a related enzyme. A hydrolytic activity has also been identified that destroys
mischarged Val-tRNA^Ile (Schimmel, P. et al. (1998) FASEB J. 12:1599-1609). One likely scenario
for the evolution of Glu-AdT in primitive life forms is the absence of a specific glutaminyl-tRNA
synthetase (GlnRS), requiring an alternative pathway for the synthesis of Gln-tRNA^Gln. In fact,
deletion of the Glu-AdT operon in Gram-positive bacteria is lethal (Curnow, A.W. et al. (1997) Proc.
Natl. Acad. Sci. USA 94:11819-11826). The existence of GluRS activity in other organisms has been
inferred by the high degree of conservation in translation machinery in nature; however, GluRS has not
been identified in all organisms, including Homo sapiens. Such an enzyme would be responsible for
ensuring translational fidelity and reducing the synthesis of defective polypeptides.
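
The two-step transamidation pathway described above can be illustrated with a small bookkeeping sketch. The Python below is only an illustration under stated assumptions: the species labels are plain-text stand-ins, the function names are hypothetical, and the final product of the second reaction is read as inorganic phosphate (Pi). It tracks molecule counts and is not a kinetic model.

```python
from collections import Counter

# Reactions transcribed from the scheme in the text (substrates, products).
# "Pi" for the last product of the second reaction is an assumption.
GLURS = (
    Counter({"tRNA(Gln)": 1, "Glu": 1, "ATP": 1}),
    Counter({"Glu-tRNA(Gln)": 1, "AMP": 1, "PPi": 1}),
)
GLU_ADT = (
    Counter({"Glu-tRNA(Gln)": 1, "Gln": 1, "ATP": 1}),
    Counter({"Gln-tRNA(Gln)": 1, "Glu": 1, "ADP": 1, "Pi": 1}),
)


def apply_reaction(pool, reaction):
    """Consume the substrates of one reaction from the pool and add its products."""
    substrates, products = reaction
    for species, needed in substrates.items():
        if pool[species] < needed:
            raise ValueError(f"not enough {species} for this reaction")
    result = Counter(pool)
    result.subtract(substrates)
    result.update(products)
    return +result  # unary plus drops species whose count fell to zero


def transamidation_pathway(pool):
    """Run the GluRS step followed by the Glu-AdT step on a pool of molecules."""
    return apply_reaction(apply_reaction(pool, GLURS), GLU_ADT)


if __name__ == "__main__":
    start = Counter({"tRNA(Gln)": 1, "Glu": 1, "Gln": 1, "ATP": 2})
    print(dict(transamidation_pathway(start)))
```

In this sketch the Glu consumed in the first step is regenerated in the second, so the net conversion is one tRNA^Gln and one Gln into one Gln-tRNA^Gln at the cost of two ATP.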

In addition to their function in protein synthesis, specific aminoacyl tRNA synthetases also
play roles in cellular fidelity, RNA splicing, RNA trafficking, apoptosis, and transcriptional and
translational regulation. For example, human tyrosyl-tRNA synthetase can be proteolytically cleaved
into two fragments with distinct cytokine activities. The carboxy-terminal domain exhibits monocyte
and leukocyte chemotaxis activity as well as stimulating production of myeloperoxidase, tumor
necrosis factor-α, and tissue factor. The N-terminal domain binds to the interleukin-8 type A receptor
and functions as an interleukin-8-like cytokine. Human tyrosyl-tRNA synthetase is secreted from
apoptotic tumor cells and may accelerate apoptosis (Wakasugi, K., and Schimmel, P. (1999) Science
284:147-151). Mitochondrial Neurospora crassa TyrRS and S. cerevisiae LeuRS are essential
factors for certain group I intron splicing activities, and human mitochondrial LeuRS can substitute for
the yeast LeuRS in a yeast null strain. Certain bacterial aaRSs are involved in regulating their own
transcription or translation (Martinis et al., supra). Several aaRSs are able to synthesize diadenosine
oligophosphates, a class of signalling molecules with roles in cell proliferation, differentiation, and
apoptosis (Kisselev, L.L. et al. (1998) FEBS Lett. 427:157-163; Vartanian, A. et al. (1999) FEBS Lett.
456:175-180).

Autoantibodies against aminoacyl-tRNAs are generated by patients with autoimmune diseases
such as rheumatic arthritis, dermatomyositis and polymyositis, and correlate strongly with complicating
interstitial lung disease (ILD) (Freist, W. et al. (1999) Biol. Chem. 380:623-646; Freist, W. et al.
(1996) Biol. Chem. Hoppe Seyler 377:343-356). These antibodies appear to be generated in response
to viral infection, and coxsackie virus has been used to induce experimental viral myositis in animals.

Comparison of aaRS structures between humans and pathogens has been useful in the design
of novel antibiotics (Schimmel et al., supra). Genetically engineered aaRSs have been utilized to
allow site-specific incorporation of unnatural amino acids into proteins in vivo (Liu, D.R. et al. (1997)
Proc. Natl. Acad. Sci. USA 94:10092-10097).
tRNA Modifications 

The modified ribonucleoside, pseudouridine (ψ), is present ubiquitously in the anticodon regions
of transfer RNAs (tRNAs), large and small ribosomal RNAs (rRNAs), and small nuclear RNAs
(snRNAs). ψ is the most common of the modified nucleosides (i.e., other than G, A, U, and C) present
in tRNAs. Only a few yeast tRNAs that are not involved in protein synthesis do not contain ψ
(Cortese, R. et al. (1974) J. Biol. Chem. 249:1103-1108). The enzyme responsible for the conversion
of uridine to ψ, pseudouridine synthase (pseudouridylate synthase), was first isolated from Salmonella
typhimurium (Arena, F. et al. (1978) Nucleic Acids Res. 5:4523-4536). The enzyme has since been
isolated from a number of mammals, including steer and mice (Green, C.J. et al. (1982) J. Biol. Chem.
257:3045-52; and Chen, J. and J.R. Patton (1999) RNA 5:409-419). tRNA pseudouridine synthases
have been the most extensively studied members of the family. They require a thiol donor (e.g.,
cysteine) and a monovalent cation (e.g., ammonia or potassium) for optimal activity. Additional
cofactors or high energy molecules (e.g., ATP or GTP) are not required (Green et al., supra). Other
eukaryotic pseudouridine synthases have been identified that appear to be specific for rRNA
(reviewed in Smith, C.M. and J.A. Steitz (1997) Cell 89:669-672) and a dual-specificity enzyme has
been identified that uses both tRNA and rRNA substrates (Wrzesinski, J. et al. (1995) RNA 1:
437-448). The absence of ψ in the anticodon loop of tRNAs results in reduced growth in both
bacteria (Singer, C.E. et al. (1972) Nature New Biol. 238:72-74) and yeast (Lecointe, F. (1998) J.
Biol. Chem. 273:1316-1323), although the genetic defect is not lethal.

Another ribonucleoside modification that occurs primarily in eukaryotic cells is the conversion
of guanosine to N^2,N^2-dimethylguanosine (m^2_2G) at position 26 or 10 at the base of the D-stem of
cytosolic and mitochondrial tRNAs. This posttranscriptional modification is believed to stabilize tRNA
structure by preventing the formation of alternative tRNA secondary and tertiary structures. Yeast
tRNA^ is unusual in that it does not contain this modification. The modification does not occur in
eubacteria, presumably because the structure of tRNAs in these cells and organelles is sequence
constrained and does not require posttranscriptional modification to prevent the formation of
alternative structures (Steinberg, S. and R. Cedergren (1995) RNA 1:886-891, and references within).
The enzyme responsible for the conversion of guanosine to m^2_2G is a 63 kDa S-adenosylmethionine
(SAM)-dependent tRNA N^2,N^2-dimethylguanosine methyltransferase (also referred to as the TRM1
gene product and herein referred to as TRM) (Edqvist, J. (1995) Biochimie 77:54-61). The enzyme
localizes to both the nucleus and the mitochondria (Li, J-M. et al. (1989) J. Cell Biol. 109:1411-1419).
Based on studies with TRM from Xenopus laevis, there appears to be a requirement for base pairing
at positions C11-G24 and G10-C25 immediately preceding the G26 to be modified, with other
structural features of the tRNA also being required for the proper presentation of the G26 substrate
(Edqvist J. et al. (1992) Nucleic Acids Res. 20:6575-6581). Studies in yeast suggest that cells
carrying a weak ochre tRNA suppressor (sup3-i) are unable to suppress translation termination in the
absence of TRM activity, suggesting a role for TRM in modifying the frequency of suppression in
eukaryotic cells (Niederberger, C. et al. (1999) FEBS Lett. 464:67-70), in addition to the more general
function of ensuring the proper three-dimensional structures for tRNA.
Translation Initiation

Initiation of translation can be divided into three stages. The first stage brings an initiator 
transfer RNA (Met-tRNAf) together with the 40S ribosomal subunit to form the 43S preinitiation 
complex. The second stage binds the 43S preinitiation complex to the mRNA, followed by migration
of the complex to the correct AUG initiation codon. The third stage brings the 60S ribosomal subunit
to the 40S subunit to generate an 80S ribosome at the initiation codon. Regulation of translation
primarily involves the first and second stage in the initiation process (Pain, V.M. (1996) Eur. J. 
Biochem. 236:747-771). 

Several initiation factors, many of which contain multiple subunits, are involved in bringing an
initiator tRNA and the 40S ribosomal subunit together. eIF2, a guanine nucleotide binding protein,
recruits the initiator tRNA to the 40S ribosomal subunit. Only when eIF2 is bound to GTP does it
associate with the initiator tRNA. eIF2B, a guanine nucleotide exchange protein, is responsible for
converting eIF2 from the GDP-bound inactive form to the GTP-bound active form. Two other
factors, eIF1A and eIF3, bind and stabilize the 40S subunit by interacting with the 18S ribosomal RNA
and specific ribosomal structural proteins. eIF3 is also involved in association of the 40S ribosomal
subunit with mRNA. The Met-tRNAf, eIF1A, eIF3, and 40S ribosomal subunit together make up the
43S preinitiation complex (Pain, supra).

Additional factors are required for binding of the 43S preinitiation complex to an mRNA
molecule, and the process is regulated at several levels. eIF4F is a complex consisting of three
proteins: eIF4E, eIF4A, and eIF4G. eIF4E recognizes and binds to the mRNA 5'-terminal m7GTP
cap, eIF4A is a bidirectional RNA-dependent helicase, and eIF4G is a scaffolding polypeptide. eIF4G
has three binding domains. The N-terminal third of eIF4G interacts with eIF4E, the central third
interacts with eIF4A, and the C-terminal third interacts with eIF3 bound to the 43S preinitiation
complex. Thus, eIF4G acts as a bridge between the 40S ribosomal subunit and the mRNA (Hentze,
M.W. (1997) Science 275:500-501).

The ability of eIF4F to initiate binding of the 43S preinitiation complex is regulated by
structural features of the mRNA. The mRNA molecule has an untranslated region (UTR) between
the 5' cap and the AUG start codon. In some mRNAs this region forms secondary structures that
impede binding of the 43S preinitiation complex. The helicase activity of eIF4A is thought to function
in removing this secondary structure to facilitate binding of the 43S preinitiation complex (Pain,
supra).

Translation Elongation 

Elongation is the process whereby additional amino acids are joined to the initiator methionine
to form the complete polypeptide chain. The elongation factors EF1α, EF1βγ, and EF2 are involved
in elongating the polypeptide chain following initiation. EF1α is a GTP-binding protein. In its
GTP-bound form, EF1α brings an aminoacyl-tRNA to the ribosome's A site. The amino acid attached to
the newly arrived aminoacyl-tRNA forms a peptide bond with the initiator methionine. The GTP on
EF1α is hydrolyzed to GDP, and EF1α-GDP dissociates from the ribosome. EF1βγ binds EF1α-GDP
and induces the dissociation of GDP from EF1α, allowing EF1α to bind GTP and a new cycle to begin.

As subsequent aminoacyl-tRNAs are brought to the ribosome, EF-G, another GTP-binding 
protein, catalyzes the translocation of tRNAs from the A site to the P site and finally to the E site of 
the ribosome. This allows the ribosome and the mRNA to remain attached during translation.
Translation Termination 

The release factor eRF carries out termination of translation. eRF recognizes stop codons in 
the mRNA, leading to the release of the polypeptide chain from the ribosome. 
Expression profiling 

Microarrays are analytical tools used in bioanalysis. A microarray has a plurality of molecules
spatially distributed over, and stably associated with, the surface of a solid support. Microarrays of
polypeptides, polynucleotides, and/or antibodies have been developed and find use in a variety of
applications, such as gene sequencing, monitoring gene expression, gene mapping, bacterial
identification, drug discovery, and combinatorial chemistry.

One area in particular in which microarrays find use is in gene expression analysis. Array
technology can provide a simple way to explore the expression of a single polymorphic gene or the
expression profile of a large number of related or unrelated genes. When the expression of a single
gene is examined, arrays are employed to detect the expression of a specific gene or its variants.
When an expression profile is examined, arrays provide a platform for identifying genes that are tissue
specific, are affected by a substance being tested in a toxicology assay, are part of a signaling
cascade, carry out housekeeping functions, or are specifically related to a particular genetic
predisposition, condition, disease, or disorder.
Breast Cancer 

There are more than 180,000 new cases of breast cancer diagnosed each year, and the
mortality rate for breast cancer approaches 10% of all deaths in females between the ages of 45-54
(K. Gish (1999) AWIS Magazine 28:7-10). However, the survival rate based on early diagnosis of
localized breast cancer is extremely high (97%), compared with the advanced stage of the disease in
which the tumor has spread beyond the breast (22%). Current procedures for clinical breast
examination are lacking in sensitivity and specificity, and efforts are underway to develop
comprehensive gene expression profiles for breast cancer that may be used in conjunction with
conventional screening methods to improve diagnosis and prognosis of this disease (Perou, C.M. et al.
(2000) Nature 406:747-752).

Mutations in two genes, BRCA1 and BRCA2, are known to greatly predispose a woman to
breast cancer and may be passed on from parents to children (Gish, K. (1999) AWIS Magazine 28:7- 
10). However, this type of hereditary breast cancer accounts for only about 5% to 9% of breast 
cancers, while the vast majority of breast cancer is due to non-inherited mutations that occur in breast 
epithelial cells. 

The relationship between expression of epidermal growth factor (EGF) and its receptor,
EGFR, to human mammary carcinoma has been particularly well studied (see Khazaie, K. et al.
(1993) Cancer and Metastasis Rev. 12:255-274, and references cited therein for a review of this
area). Overexpression of EGFR, particularly coupled with down-regulation of the estrogen receptor,
is a marker of poor prognosis in breast cancer patients. In addition, EGFR expression in breast tumor
metastases is frequently elevated relative to the primary tumor, suggesting that EGFR is involved in
tumor progression and metastasis. This is supported by accumulating evidence that EGF has effects
on cell functions related to metastatic potential, such as cell motility, chemotaxis, secretion and
differentiation. Changes in expression of other members of the erbB receptor family, of which EGFR
is one, have also been implicated in breast cancer. The abundance of erbB receptors, such as
HER-2/neu, HER-3, and HER-4, and their ligands in breast cancer points to their functional importance in
the pathogenesis of the disease, and may therefore provide targets for therapy of the disease (Bacus,
S. S. et al. (1994) Am. J. Clin. Pathol. 102:S13-S24). Other known markers of breast cancer include
a human secreted frizzled protein mRNA that is downregulated in breast tumors; the matrix Gla
protein which is overexpressed in human breast carcinoma cells; Drg1 or RTP, a gene whose
expression is diminished in colon, breast, and prostate tumors; maspin, a tumor suppressor gene
downregulated in invasive breast carcinomas; and CaN19, a member of the S100 protein family, all of
which are down regulated in mammary carcinoma cells relative to normal mammary epithelial cells
(Zhou, Z. et al. (1998) Int. J. Cancer 78:95-99; Chen, L. et al. (1990) Oncogene 5:1391-1395; Ulrix,
W. et al. (1999) FEBS Lett. 455:23-26; Sager, R. et al. (1996) Curr. Top. Microbiol. Immunol.
213:51-64; and Lee, S. W. et al. (1992) Proc. Natl. Acad. Sci. USA 89:2504-2508).

Cell lines derived from human mammary epithelial cells at various stages of breast cancer
provide a useful model to study the process of malignant transformation and tumor progression as it
has been shown that these cell lines retain many of the properties of their parental tumors for lengthy
culture periods (Wistuba, I.I. et al. (1998) Clin. Cancer Res. 4:2931-2938). Such a model is
particularly useful for comparing phenotypic and molecular characteristics of human mammary
epithelial cells at various stages of malignant transformation.

The immune system responds to infection or trauma by activating a cascade of events that 
coordinate the progressive selection, amplification, and mobilization of cellular defense mechanisms.
A complex and balanced program of gene activation and repression is involved in this process.
However, hyperactivity of the immune system as a result of improper or insufficient regulation of gene
expression may result in considerable tissue or organ damage. This damage is well documented in
immunological responses associated with arthritis, allergens, heart attack, stroke, and infections
(Harrison's Principles of Internal Medicine, 13/e, McGraw Hill, Inc. and Teton Data Systems
Software, 1996). In particular, a zinc finger protein termed Staf50 (for Stimulated trans-acting factor
of 50 kDa) is a transcriptional regulator and is induced in various cell lines by interferon-I and -II.
Staf50 appears to mediate the antiviral activity of interferon by down-regulating the viral transcription
directed by the long terminal repeat promoter region of human immunodeficiency virus type-1 in
transfected cells (Tissot, C. (1995) J. Biol. Chem. 270:14891-14898).

Dendritic cells (DC) are antigen presenting cells (APC) that play a key role in the primary 
immune response because of their unique ability to present antigens to naive T-cells. In addition, DC 
differentiate into separate subsets of mature immune cells that sustain and regulate immune responses 
following initial contact with antigen. DC subsets include those that preferentially induce particular T 
helper 1 (Th1) or T helper 2 (Th2) responses and those that regulate B cell responses. Moreover, DC
are being used with increasing frequency to manipulate immune responses, either to downregulate 
aberrant autoimmune response or to enhance vaccination or tumor-specific response. 

DC are functionally specialized in correlation with their particular differentiation state. CD34+ 
myeloid cells found in the bone marrow mature in response to signals into CD14+ CD11c+ monocytes.
An innate or antigen non-specific response takes place initially when monocytes circulate to
nonlymphoid tissues and respond to lipopolysaccharide (LPS), a bacterially-derived mitogen, and
viruses. Such direct encounters with antigen cause secretion of pro-inflammatory cytokines that
attract and regulate natural killer cells, macrophages, and eosinophils in the first line of defense against
invading pathogens. Monocytes then mature into DC, which efficiently capture antigen through
endocytosis and antigen-receptor uptake. Antigen processing and presentation trigger activation and
differentiation into mature DC that express MHC class II molecules on the cell surface and efficiently
activate T-cells, initiating antigen-specific T-cell and B-cell responses. In turn, T-cells activate DC
through CD40 ligand - CD40 interactions, which stimulate expression of the costimulatory molecules
CD80 and CD86, the latter being the most potent in amplifying T-cell responses. DC interaction via CD40 with
T cells also stimulates the production of inflammatory cytokines such as TNF alpha and IL-1.
Engagement of RANK, a member of the TNF receptor family, by its ligand, TRANCE, which is
expressed on activated T cells, enhances the survival of DC through inhibition of apoptosis, thereby
enhancing T cell activation. The maturation and differentiation of monocytes into mature DC links the
antigen non-specific innate immune response to the antigen-specific adaptive immune response.

Human peripheral blood mononuclear cells (PBMCs) can be classified into discrete cellular 
populations representing the major components of the immune system. PBMCs contain about 52% 
lymphocytes (12% B lymphocytes, 40% T lymphocytes {25% CD4+ and 15% CD8+}), 20% NK 
cells, 25% monocytes, and 3% various cells that include dendritic cells and progenitor cells. The
proportions, as well as the biology of these cellular components, tend to vary slightly between healthy
individuals, depending on factors such as age, past medical history, and genetic backgrounds.
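
The approximate PBMC proportions given above are internally consistent, which can be checked with a short sketch. The Python below is illustrative only: the dictionary keys and function names are hypothetical, and the percentages are the approximate figures quoted in the text, which vary between individuals.

```python
# Approximate PBMC composition, as percent of total PBMCs, taken from the text.
PBMC_PERCENT = {
    "B lymphocytes": 12,
    "CD4+ T lymphocytes": 25,
    "CD8+ T lymphocytes": 15,
    "NK cells": 20,
    "monocytes": 25,
    "other (dendritic and progenitor cells)": 3,
}

T_LYMPHOCYTES = ("CD4+ T lymphocytes", "CD8+ T lymphocytes")
LYMPHOCYTES = ("B lymphocytes",) + T_LYMPHOCYTES


def subtotal(names):
    """Sum the percentages of the named populations."""
    return sum(PBMC_PERCENT[name] for name in names)


def expected_counts(total_cells):
    """Split a total PBMC count into populations using the approximate percentages."""
    return {name: total_cells * pct // 100 for name, pct in PBMC_PERCENT.items()}


if __name__ == "__main__":
    print("T lymphocytes:", subtotal(T_LYMPHOCYTES), "%")
    print("All lymphocytes:", subtotal(LYMPHOCYTES), "%")
    print("Total:", sum(PBMC_PERCENT.values()), "%")
    print(expected_counts(1_000_000))
```

The subtotals reproduce the 40% T lymphocyte and 52% lymphocyte figures, and the six populations together account for 100% of the cells.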
Steroid Hormones 

Steroids are a class of lipid-soluble molecules, including cholesterol, bile acids, vitamin D, and
hormones, that share a common four-ring structure based on cyclopentanoperhydrophenanthrene and
that carry out a wide variety of functions. Cholesterol, for example, is a component of cell
membranes that controls membrane fluidity. It is also a precursor for bile acids which solubilize lipids
and facilitate absorption in the small intestine during digestion. Vitamin D regulates the absorption of
calcium in the small intestine and controls the concentration of calcium in plasma. Steroid hormones,
produced by the adrenal cortex, ovaries, and testes, include glucocorticoids, mineralocorticoids,
androgens, and estrogens. They control various biological processes by binding to intracellular
receptors that regulate transcription of specific genes in the nucleus. Glucocorticoids, for example,
increase blood glucose concentrations by regulation of gluconeogenesis in the liver, increase blood
concentrations of fatty acids by promoting lipolysis in adipose tissues, modulate sensitivity to
catecholamines in the central nervous system, and reduce inflammation. The principal
mineralocorticoid, aldosterone, is produced by the adrenal cortex and acts on cells of the distal tubules
of the kidney to enhance sodium ion reabsorption. Androgens, produced by the interstitial cells of
Leydig in the testis, include the male sex hormone testosterone, which triggers changes at puberty, the
production of sperm and maintenance of secondary sexual characteristics. Female sex hormones,
estrogen and progesterone, are produced by the ovaries and also by the placenta and adrenal cortex of
the fetus during pregnancy. Estrogen regulates female reproductive processes and secondary sexual
characteristics. Progesterone regulates changes in the endometrium during the menstrual cycle and
pregnancy.

Steroid hormones are widely used for fertility control and in anti-inflammatory treatments for 
physical injuries and diseases such as arthritis, asthma, and auto-immune disorders. Progesterone, a
naturally occurring progestin, is primarily used to treat amenorrhea, abnormal uterine bleeding, or as a
contraceptive. Endogenous progesterone is responsible for inducing secretory activity in the
endometrium of the estrogen-primed uterus in preparation for the implantation of a fertilized egg and
for the maintenance of pregnancy. It is secreted from the corpus luteum in response to luteinizing
hormone (LH). The primary contraceptive effect of exogenous progestins involves the suppression
of the midcycle surge of LH. At the cellular level, progestins diffuse freely into target cells and bind
to the progesterone receptor. Target cells include the female reproductive tract, the mammary gland,
the hypothalamus, and the pituitary. Once bound to the receptor, progestins slow the frequency of
release of gonadotropin releasing hormone from the hypothalamus and blunt the pre-ovulatory LH
surge, thereby preventing follicular maturation and ovulation. Progesterone has minimal estrogenic
and androgenic activity. Progesterone is metabolized hepatically to pregnanediol and conjugated with
glucuronic acid.

Medroxyprogesterone (MAH), also known as 6α-methyl-17-hydroxyprogesterone, is a
synthetic progestin with a pharmacological activity about 15 times greater than progesterone. MAH is
used for the treatment of renal and endometrial carcinomas, amenorrhea, abnormal uterine bleeding,
and endometriosis associated with hormonal imbalance. MAH has a stimulatory effect on respiratory
centers and has been used in cases of low blood oxygenation caused by sleep apnea, chronic
obstructive pulmonary disease, or hypercapnia.

Mifepristone, also known as RU-486, is an antiprogesterone drug that blocks receptors of
progesterone. It counteracts the effects of progesterone, which is needed to sustain pregnancy.
Mifepristone induces spontaneous abortion when administered in early pregnancy followed by
treatment with the prostaglandin, misoprostol. Further, studies show that mifepristone at a
substantially lower dose can be highly effective as a postcoital contraceptive when administered within
five days after unprotected intercourse, thus providing women with a "morning-after pill" in case of
contraceptive failure or sexual assault. Mifepristone also has potential uses in the treatment of breast
and ovarian cancers in cases in which tumors are progesterone-dependent. It interferes with
steroid-dependent growth of brain meningiomas, and may be useful in treatment of endometriosis where it
blocks the estrogen-dependent growth of endometrial tissues. It may also be useful in treatment of
uterine fibroid tumors and Cushing's Syndrome. Mifepristone binds to glucocorticoid receptors and
interferes with cortisol binding. Mifepristone also may act as an anti-glucocorticoid and be effective
for treating conditions where cortisol levels are elevated such as AIDS, anorexia nervosa, ulcers,
diabetes, Parkinson's disease, multiple sclerosis, and Alzheimer's disease.

Danazol is a synthetic steroid derived from ethinyl testosterone. Danazol indirectly reduces
estrogen production by lowering pituitary synthesis of follicle-stimulating hormone and LH. Danazol
also binds to sex hormone receptors in target tissues, thereby exhibiting anabolic, antiestrogenic, and
weakly androgenic activity. Danazol does not possess any progestogenic activity, and does not
suppress normal pituitary release of corticotropin or release of cortisol by the adrenal glands. Danazol
is used in the treatment of endometriosis to relieve pain and inhibit endometrial cell growth. It is also
used to treat fibrocystic breast disease and hereditary angioedema.

Corticosteroids are used to relieve inflammation and to suppress the immune response. They
inhibit eosinophil, basophil, and airway epithelial cell function by regulation of cytokines that mediate
the inflammatory response. They inhibit leukocyte infiltration at the site of inflammation, interfere in
the function of mediators of the inflammatory response, and suppress the humoral immune response.
Corticosteroids are used to treat allergies, asthma, arthritis, and skin conditions. Beclomethasone is a
synthetic glucocorticoid that is used to treat steroid-dependent asthma, to relieve symptoms associated
with allergic or nonallergic (vasomotor) rhinitis, or to prevent recurrent nasal polyps following surgical
removal. The anti-inflammatory and vasoconstrictive effects of intranasal beclomethasone are 5000
times greater than those produced by hydrocortisone. Budesonide is a corticosteroid used to control
symptoms associated with allergic rhinitis or asthma. Budesonide has high topical anti-inflammatory
activity but low systemic activity. Dexamethasone is a synthetic glucocorticoid used in
anti-inflammatory or immunosuppressive compositions. It is also used in inhalants to prevent symptoms of
asthma. Due to its greater ability to reach the central nervous system, dexamethasone is usually the
treatment of choice to control cerebral edema. Dexamethasone is approximately 20-30 times more
potent than hydrocortisone and 5-7 times more potent than prednisone. Prednisone is metabolized in
the liver to its active form, prednisolone, a glucocorticoid with anti-inflammatory properties.
Prednisone is approximately 4 times more potent than hydrocortisone and the duration of action of
prednisone is intermediate between hydrocortisone and dexamethasone. Prednisone is used to treat
allograft rejection, asthma, systemic lupus erythematosus, arthritis, ulcerative colitis, and other
inflammatory conditions. Betamethasone is a synthetic glucocorticoid with anti-inflammatory and
immunosuppressive activity and is used to treat psoriasis and fungal infections, such as athlete's foot
and ringworm.
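
The relative potencies quoted above can be cross-checked against one another. The sketch below is purely arithmetic and assumes that the stated ratios may be multiplied together; the constant and function names are hypothetical, and the figures are the approximate values given in the text, not dosing guidance.

```python
# Approximate relative potencies quoted in the text.
PREDNISONE_VS_HYDROCORTISONE = 4            # prednisone is about 4x hydrocortisone
DEXAMETHASONE_VS_PREDNISONE = (5, 7)        # dexamethasone is about 5-7x prednisone
DEXAMETHASONE_VS_HYDROCORTISONE = (20, 30)  # dexamethasone is about 20-30x hydrocortisone


def scale_range(ratio_range, factor):
    """Multiply both ends of a (low, high) ratio range by a constant factor."""
    low, high = ratio_range
    return (low * factor, high * factor)


def ranges_overlap(a, b):
    """Return True if two closed (low, high) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    implied = scale_range(DEXAMETHASONE_VS_PREDNISONE, PREDNISONE_VS_HYDROCORTISONE)
    print("Implied dexamethasone vs hydrocortisone:", implied)
    print("Consistent with stated range:",
          ranges_overlap(implied, DEXAMETHASONE_VS_HYDROCORTISONE))
```

Chaining the prednisone ratios gives an implied dexamethasone potency of roughly 20 to 28 times that of hydrocortisone, which falls within the stated 20-30 range.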

The anti-inflammatory actions of corticosteroids are thought to involve phospholipase A2
inhibitory proteins, collectively called lipocortins. Lipocortins, in turn, control the biosynthesis of potent
mediators of inflammation such as prostaglandins and leukotrienes by inhibiting the release of the
precursor molecule arachidonic acid. Proposed mechanisms of action include decreased IgE
synthesis, increased number of β-adrenergic receptors on leukocytes, and decreased arachidonic acid
metabolism. During an immediate allergic reaction, such as in chronic bronchial asthma, allergens
bridge the IgE antibodies on the surface of mast cells, which triggers these cells to release
chemotactic substances. Mast cell influx and activation, therefore, is partially responsible for the
inflammation and hyperirritability of the oral mucosa in asthmatic patients. This inflammation can be
retarded by administration of corticosteroids.

There is a need in the art for new compositions, including nucleic acids and proteins, for the 
diagnosis, prevention, and treatment of cell proliferative, neurological, reproductive, developmental,
autoimmune/inflammatory, and DNA repair disorders, and infections. 

SUMMARY OF THE INVENTION 

Various embodiments of the invention provide purified polypeptides, nucleic acid-associated
proteins, referred to collectively as "NAAP" and individually as "NAAP-1," "NAAP-2," "NAAP-3,"
"NAAP-4," "NAAP-5," "NAAP-6," "NAAP-7," "NAAP-8," "NAAP-9," "NAAP-10,"
"NAAP-11," "NAAP-12," "NAAP-13," "NAAP-14," "NAAP-15," "NAAP-16," "NAAP-17," "NAAP-18,"
"NAAP-19," "NAAP-20," "NAAP-21," "NAAP-22," "NAAP-23," "NAAP-24," "NAAP-25,"
"NAAP-26," "NAAP-27," "NAAP-28," "NAAP-29," "NAAP-30," "NAAP-31," "NAAP-32,"
"NAAP-33," "NAAP-34," "NAAP-35," and "NAAP-36," and methods for using these proteins and
their encoding polynucleotides for the detection, diagnosis, and treatment of diseases and medical
conditions. Embodiments also provide methods for utilizing the purified nucleic acid-associated
proteins and/or their encoding polynucleotides for facilitating the drug discovery process, including
determination of efficacy, dosage, toxicity, and pharmacology. Related embodiments provide methods
for utilizing the purified nucleic acid-associated proteins and/or their encoding polynucleotides for
investigating the pathogenesis of diseases and medical conditions.

An embodiment provides an isolated polypeptide selected from the group consisting of a) a
polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID
NO:1-36, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical or at
least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:1-36, c) a biologically active fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-36, and d) an immunogenic fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-36. Another
embodiment provides an isolated polypeptide comprising an amino acid sequence of SEQ ID NO:1-36.
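
The "at least 90% identical" criterion recited above depends on how percent identity is computed. The following Python is a minimal sketch, assuming the two sequences have already been aligned to equal length with "-" marking gaps and that identity is counted over all aligned columns; the function names and the example peptides are hypothetical, and other alignment methods or denominators may give different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumes the inputs are already aligned (equal length, gaps marked with `gap`).
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=90.0):
    """Return True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical ten-residue peptides differing at a single position.
    reference = "MKTAYIAKQR"
    variant = "MKTAYIAKQK"
    print(percent_identity(reference, variant))
    print(meets_identity_threshold(reference, variant))
```

With this definition, a single substitution in a ten-residue alignment gives 90% identity and therefore just meets the threshold, while two substitutions would not.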

Still another embodiment provides an isolated polynucleotide encoding a polypeptide selected
from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the
group consisting of SEQ ID NO:1-36, b) a polypeptide comprising a naturally occurring amino acid
sequence at least 90% identical or at least about 90% identical to an amino acid sequence selected
from the group consisting of SEQ ID NO:1-36, c) a biologically active fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-36, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-36. In another embodiment, the polynucleotide encodes a polypeptide
selected from the group consisting of SEQ ID NO:1-36. In an alternative embodiment, the
polynucleotide is selected from the group consisting of SEQ ID NO:37-72.

Still another embodiment provides a recombinant polynucleotide comprising a promoter
sequence operably linked to a polynucleotide encoding a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-36, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical or at least about 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-36, c) a biologically active fragment of a polypeptide having an amino acid
sequence selected from the group consisting of SEQ ID NO:1-36, and d) an immunogenic fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-36.
Another embodiment provides a cell transformed with the recombinant polynucleotide. Yet another
embodiment provides a transgenic organism comprising the recombinant polynucleotide.

Another embodiment provides a method for producing a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-36, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical or at least about 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-36, c) a biologically active fragment of a polypeptide having an amino acid
sequence selected from the group consisting of SEQ ID NO:1-36, and d) an immunogenic fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-36.
The method comprises a) culturing a cell under conditions suitable for expression of the polypeptide,
wherein said cell is transformed with a recombinant polynucleotide comprising a promoter sequence
operably linked to a polynucleotide encoding the polypeptide, and b) recovering the polypeptide so
expressed.

Yet another embodiment provides an isolated antibody which specifically binds to a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence
selected from the group consisting of SEQ ID NO:1-36, b) a polypeptide comprising a naturally
occurring amino acid sequence at least 90% identical or at least about 90% identical to an amino acid
sequence selected from the group consisting of SEQ ID NO:1-36, c) a biologically active fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-36,
and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-36.

Still yet another embodiment provides an isolated polynucleotide selected from the group
consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ ID NO:37-72, b) a polynucleotide comprising a naturally occurring polynucleotide
sequence at least 90% identical or at least about 90% identical to a polynucleotide sequence selected
from the group consisting of SEQ ID NO:37-72, c) a polynucleotide complementary to the
polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA
equivalent of a)-d). In other embodiments, the polynucleotide can comprise at least about 20, 30, 40,
60, 80, or 100 contiguous nucleotides.

Yet another embodiment provides a method for detecting a target polynucleotide in a sample, 
said target polynucleotide being selected from the group consisting of a) a polynucleotide comprising a 
polynucleotide sequence selected from the group consisting of SEQ ID NO:37-72, b) a polynucleotide 
comprising a naturally occurring polynucleotide sequence at least 90% identical or at least about 90% 
identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:37-72, c) a 
polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the 
polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) hybridizing the 
sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence 
complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to 
said target polynucleotide, under conditions whereby a hybridization complex is formed between said 
probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of 
said hybridization complex. In a related embodiment, the method can include detecting the amount of 
the hybridization complex. In still other embodiments, the probe can comprise at least about 20, 30, 
40, 60, 80, or 100 contiguous nucleotides. 

Still yet another embodiment provides a method for detecting a target polynucleotide in a 
sample, said target polynucleotide being selected from the group consisting of a) a polynucleotide 
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:37-72, b) a 
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at 
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID 
NO:37-72, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide 
complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method 
comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain
reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide
or fragment thereof. In a related embodiment, the method can include detecting the amount of the 
amplified target polynucleotide or fragment thereof. 

Another embodiment provides a composition comprising an effective amount of a polypeptide
selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected
from the group consisting of SEQ ID NO:1-36, b) a polypeptide comprising a naturally occurring
amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence
selected from the group consisting of SEQ ID NO:1-36, c) a biologically active fragment of a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-36,
and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-36, and a pharmaceutically acceptable excipient. In one
embodiment, the composition can comprise an amino acid sequence selected from the group consisting
of SEQ ID NO:1-36. Other embodiments provide a method of treating a disease or condition
associated with decreased or abnormal expression of functional NAAP, comprising administering to a
patient in need of such treatment the composition.

Yet another embodiment provides a method for screening a compound for effectiveness as an
agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-36, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-36, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-36, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-36. The method comprises a) exposing a sample
comprising the polypeptide to a compound, and b) detecting agonist activity in the sample. Another
embodiment provides a composition comprising an agonist compound identified by the method and a
pharmaceutically acceptable excipient. Yet another embodiment provides a method of treating a
disease or condition associated with decreased expression of functional NAAP, comprising
administering to a patient in need of such treatment the composition.

Still yet another embodiment provides a method for screening a compound for effectiveness
as an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an
amino acid sequence selected from the group consisting of SEQ ID NO:1-36, b) a polypeptide
comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90%
identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-36, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-36, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-36. The method comprises a)
exposing a sample comprising the polypeptide to a compound, and b) detecting antagonist activity in
the sample. Another embodiment provides a composition comprising an antagonist compound
identified by the method and a pharmaceutically acceptable excipient. Yet another embodiment
provides a method of treating a disease or condition associated with overexpression of functional
NAAP, comprising administering to a patient in need of such treatment the composition.

Another embodiment provides a method of screening for a compound that specifically binds to
a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-36, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-36, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-36, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-36. The method comprises a) combining the
polypeptide with at least one test compound under suitable conditions, and b) detecting binding of the
polypeptide to the test compound, thereby identifying a compound that specifically binds to the
polypeptide.

Yet another embodiment provides a method of screening for a compound that modulates the
activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-36, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-36, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-36, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-36. The method comprises a) combining the
polypeptide with at least one test compound under conditions permissive for the activity of the
polypeptide, b) assessing the activity of the polypeptide in the presence of the test compound, and c)
comparing the activity of the polypeptide in the presence of the test compound with the activity of the
polypeptide in the absence of the test compound, wherein a change in the activity of the polypeptide in
the presence of the test compound is indicative of a compound that modulates the activity of the
polypeptide.

Still yet another embodiment provides a method for screening a compound for effectiveness in
altering expression of a target polynucleotide, wherein said target polynucleotide comprises a 
polynucleotide sequence selected from the group consisting of SEQ ID NO:37-72, the method 
comprising a) exposing a sample comprising the target polynucleotide to a compound, b) detecting 
altered expression of the target polynucleotide, and c) comparing the expression of the target 
polynucleotide in the presence of varying amounts of the compound and in the absence of the 
compound. 

Another embodiment provides a method for assessing toxicity of a test compound, said 
method comprising a) treating a biological sample containing nucleic acids with the test compound; b) 
hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 
contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide 
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:37-72, ii) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at 
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID 
NO:37-72, iii) a polynucleotide having a sequence complementary to i), iv) a polynucleotide 
complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Hybridization occurs 
under conditions whereby a specific hybridization complex is formed between said probe and a target 
polynucleotide in the biological sample, said target polynucleotide selected from the group consisting of 
i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ 
ID NO:37-72, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 
90% identical or at least about 90% identical to a polynucleotide sequence selected from the group 
consisting of SEQ ID NO:37-72, iii) a polynucleotide complementary to the polynucleotide of i), iv) a
polynucleotide complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). 
Alternatively, the target polynucleotide can comprise a fragment of a polynucleotide selected from the 
group consisting of i)-v) above; c) quantifying the amount of hybridization complex; and d) comparing 
the amount of hybridization complex in the treated biological sample with the amount of hybridization 
complex in an untreated biological sample, wherein a difference in the amount of hybridization 
complex in the treated biological sample is indicative of toxicity of the test compound. 

BRIEF DESCRIPTION OF THE TABLES 

Table 1 summarizes the nomenclature for full length polynucleotide and polypeptide 
embodiments of the invention. 

Table 2 shows the GenBank identification number and annotation of the nearest GenBank
homolog, and the PROTEOME database identification numbers and annotations of PROTEOME
database homologs, for polypeptide embodiments of the invention. The probability scores for the
matches between each polypeptide and its homolog(s) are also shown.

Table 3 shows structural features of polypeptide embodiments, including predicted motifs and
domains, along with the methods, algorithms, and searchable databases used for analysis of the
polypeptides.

Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble
polynucleotide embodiments, along with selected fragments of the polynucleotides.

Table 5 shows representative cDNA libraries for polynucleotide embodiments.

Table 6 provides an appendix which describes the tissues and vectors used for construction of
the cDNA libraries shown in Table 5.

Table 7 shows the tools, programs, and algorithms used to analyze polynucleotides and 
polypeptides, along with applicable descriptions, references, and threshold parameters. 

Table 8 shows single nucleotide polymorphisms found in polynucleotide embodiments, along 
with allele frequencies in different human populations. 

DESCRIPTION OF THE INVENTION 

Before the present proteins, nucleic acids, and methods are described, it is understood that 
embodiments of the invention are not limited to the particular machines, instruments, materials, and 
methods described, as these may vary. It is also to be understood that the terminology used herein is 
for the purpose of describing particular embodiments only, and is not intended to limit the scope of the 
invention. 

As used herein and in the appended claims, the singular forms "a," "an," and "the" include 
plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to "a 
host cell" includes a plurality of such host cells, and a reference to "an antibody" is a reference to one 
or more antibodies and equivalents thereof known to those skilled in the art, and so forth. 

Unless defined otherwise, all technical and scientific terms used herein have the same 
meanings as commonly understood by one of ordinary skill in the art to which this invention belongs. 
Although any machines, materials, and methods similar or equivalent to those described herein can be 
used to practice or test the present invention, the preferred machines, materials and methods are now 
described. All publications mentioned herein are cited for the purpose of describing and disclosing the 
cell lines, protocols, reagents and vectors which are reported in the publications and which might be 
used in connection with various embodiments of the invention. Nothing herein is to be construed as an
admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.

DEFINITIONS

"NAAP" refers to the amino acid sequences of substantially purified NAAP obtained from
any species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and
human, and from any source, whether natural, synthetic, semi-synthetic, or recombinant.

The term "agonist" refers to a molecule which intensifies or mimics the biological activity of 
NAAP. Agonists may include proteins, nucleic acids, carbohydrates, small molecules, or any other 
compound or composition which modulates the activity of NAAP either by directly interacting with 
NAAP or by acting on components of the biological pathway in which NAAP participates. 

An "allelic variant" is an alternative form of the gene encoding NAAP. Allelic variants may
result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in
polypeptides whose structure or function may or may not be altered. A gene may have none, one, or
many allelic variants of its naturally occurring form. Common mutational changes which give rise to
allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides.
Each of these types of changes may occur alone, or in combination with the others, one or more times
in a given sequence.

"Altered" nucleic acid sequences encoding NAAP include those sequences with deletions,
insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as NAAP or a
polypeptide with at least one functional characteristic of NAAP. Included within this definition are
polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe of
the polynucleotide encoding NAAP, and improper or unexpected hybridization to allelic variants, with a
locus other than the normal chromosomal locus for the polynucleotide encoding NAAP. The encoded
protein may also be "altered," and may contain deletions, insertions, or substitutions of amino acid
residues which produce a silent change and result in a functionally equivalent NAAP. Deliberate
amino acid substitutions may be made on the basis of one or more similarities in polarity, charge,
solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as the
biological or immunological activity of NAAP is retained. For example, negatively charged amino
acids may include aspartic acid and glutamic acid, and positively charged amino acids may include
lysine and arginine. Amino acids with uncharged polar side chains having similar hydrophilicity values
may include: asparagine and glutamine; and serine and threonine. Amino acids with uncharged side
chains having similar hydrophilicity values may include: leucine, isoleucine, and valine; glycine and
alanine; and phenylalanine and tyrosine.

The terms "amino acid" and "amino acid sequence" can refer to an oligopeptide, a peptide, a
polypeptide, or a protein sequence, or a fragment of any of these, and to naturally occurring or
synthetic molecules. Where "amino acid sequence" is recited to refer to a sequence of a naturally
occurring protein molecule, "amino acid sequence" and like terms are not meant to limit the amino acid
sequence to the complete native amino acid sequence associated with the recited protein molecule.

"Amplification" relates to the production of additional copies of a nucleic acid. Amplification
may be carried out using polymerase chain reaction (PCR) technologies or other nucleic acid
amplification technologies well known in the art.

The term "antagonist" refers to a molecule which inhibits or attenuates the biological activity
of NAAP. Antagonists may include proteins such as antibodies, anticalins, nucleic acids,
carbohydrates, small molecules, or any other compound or composition which modulates the activity of
NAAP either by directly interacting with NAAP or by acting on components of the biological pathway
in which NAAP participates.

The term "antibody" refers to intact immunoglobulin molecules as well as to fragments
thereof, such as Fab, F(ab')2, and Fv fragments, which are capable of binding an epitopic determinant.
Antibodies that bind NAAP polypeptides can be prepared using intact polypeptides or using fragments
containing small peptides of interest as the immunizing antigen. The polypeptide or oligopeptide used
to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of RNA,
or synthesized chemically, and can be conjugated to a carrier protein if desired. Commonly used
carriers that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, and
keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal.

The term "antigenic determinant" refers to that region of a molecule (i.e., an epitope) that 
makes contact with a particular antibody. When a protein or a fragment of a protein is used to 
immunize a host animal, numerous regions of the protein may induce the production of antibodies 
which bind specifically to antigenic determinants (particular regions or three-dimensional structures on 
the protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen used
to elicit the immune response) for binding to an antibody. 

The term "aptamer" refers to a nucleic acid or oligonucleotide molecule that binds to a
specific molecular target. Aptamers are derived from an in vitro evolutionary process (e.g., SELEX
(Systematic Evolution of Ligands by Exponential Enrichment), described in U.S. Patent No.
5,270,163), which selects for target-specific aptamer sequences from large combinatorial libraries.
Aptamer compositions may be double-stranded or single-stranded, and may include
deoxyribonucleotides, ribonucleotides, nucleotide derivatives, or other nucleotide-like molecules. The
nucleotide components of an aptamer may have modified sugar groups (e.g., the 2'-OH group of a
ribonucleotide may be replaced by 2'-F or 2'-NH2), which may improve a desired property, e.g.,
resistance to nucleases or longer lifetime in blood. Aptamers may be conjugated to other molecules,
e.g., a high molecular weight carrier to slow clearance of the aptamer from the circulatory system.
Aptamers may be specifically cross-linked to their cognate ligands, e.g., by photo-activation of a
cross-linker (Brody, E.N. and L. Gold (2000) J. Biotechnol. 74:5-13).

The term "intramer" refers to an aptamer which is expressed in vivo. For example, a 
vaccinia virus-based RNA expression system has been used to express specific RNA aptamers at 
high levels in the cytoplasm of leukocytes (Blind, M. et al. (1999) Proc. Natl. Acad. Sci. USA
96:3606-3610). 

The term "spiegelmer" refers to an aptamer which includes L-DNA, L-RNA, or other
left-handed nucleotide derivatives or nucleotide-like molecules. Aptamers containing left-handed
nucleotides are resistant to degradation by naturally occurring enzymes, which normally act on
substrates containing right-handed nucleotides.

The term "antisense" refers to any composition capable of base-pairing with the "sense"
(coding) strand of a polynucleotide having a specific nucleic acid sequence. Antisense compositions
may include DNA; RNA; peptide nucleic acid (PNA); oligonucleotides having modified backbone
linkages such as phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides
having modified sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or
oligonucleotides having modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or
7-deaza-2'-deoxyguanosine. Antisense molecules may be produced by any method including chemical
synthesis or transcription. Once introduced into a cell, the complementary antisense molecule
base-pairs with a naturally occurring nucleic acid sequence produced by the cell to form duplexes
which block either transcription or translation. The designation "negative" or "minus" can refer to the
antisense strand, and the designation "positive" or "plus" can refer to the sense strand of a reference
DNA molecule.

The term "biologically active" refers to a protein having structural, regulatory, or biochemical
functions of a naturally occurring molecule. Likewise, "immunologically active" or "immunogenic"
refers to the capability of the natural, recombinant, or synthetic NAAP, or of any oligopeptide thereof,
to induce a specific immune response in appropriate animals or cells and to bind with specific
antibodies.

"Complementary" describes the relationship between two single-stranded nucleic acid
sequences that anneal by base-pairing. For example, 5'-AGT-3' pairs with its complement,
3'-TCA-5'.
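The pairing rule in this definition can be illustrated with a minimal Python sketch. The function names and the restriction to the four unambiguous DNA bases are assumptions made for illustration; they are not part of the specification.

```python
# Watson-Crick pairing for the four unambiguous DNA bases.
# Assumption: ambiguity codes and RNA (U) are not handled in this sketch.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order.

    For a strand written 5'->3', the result reads 3'->5'.
    """
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the complementary strand rewritten in the 5'->3' direction."""
    return complement(seq)[::-1]


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if two strands, both written 5'->3', anneal over their full length."""
    return reverse_complement(strand_a) == strand_b.upper()


# 5'-AGT-3' pairs with 3'-TCA-5', which is 5'-ACT-3' when written 5'->3'.
print(complement("AGT"))          # TCA
print(reverse_complement("AGT"))  # ACT
```

Keeping `complement` and `reverse_complement` separate mirrors the notation of the example in the text, where the partner strand is written in the 3'-to-5' direction.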

A "composition comprising a given polynucleotide" and a "composition comprising a given
polypeptide" can refer to any composition containing the given polynucleotide or polypeptide. The
composition may comprise a dry formulation or an aqueous solution. Compositions comprising
polynucleotides encoding NAAP or fragments of NAAP may be employed as hybridization probes.
The probes may be stored in freeze-dried form and may be associated with a stabilizing agent such as
a carbohydrate. In hybridizations, the probe may be deployed in an aqueous solution containing salts
(e.g., NaCl), detergents (e.g., sodium dodecyl sulfate; SDS), and other components (e.g., Denhardt's
solution, dry milk, salmon sperm DNA, etc.).

"Consensus sequence" refers to a nucleic acid sequence which has been subjected to
repeated DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied
Biosystems, Foster City CA) in the 5' and/or the 3' direction, and resequenced, or which has been
assembled from one or more overlapping cDNA, EST, or genomic DNA fragments using a computer
program for fragment assembly, such as the GELVIEW fragment assembly system (GCG, Madison
WI) or Phrap (University of Washington, Seattle WA). Some sequences have been both extended
and assembled to produce the consensus sequence.

"Conservative amino acid substitutions" are those substitutions that are predicted to least
interfere with the properties of the original protein, i.e., the structure and especially the function of the
protein is conserved and not significantly changed by such substitutions. The table below shows amino
acids which may be substituted for an original amino acid in a protein and which are regarded as
conservative amino acid substitutions.

Original Residue        Conservative Substitution
Ala                     Gly, Ser
Arg                     His, Lys
Asn                     Asp, Gln, His
Asp                     Asn, Glu
Cys                     Ala, Ser
Gln                     Asn, Glu, His
Glu                     Asp, Gln, His
Gly                     Ala
His                     Asn, Arg, Gln, Glu
Ile                     Leu, Val
Leu                     Ile, Val
Lys                     Arg, Gln, Glu
Met                     Leu, Ile
Phe                     His, Met, Leu, Trp, Tyr
Ser                     Cys, Thr
Thr                     Ser, Val
Trp                     Phe, Tyr
Tyr                     His, Phe, Trp
Val                     Ile, Leu, Thr

Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide
backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, 
(b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of 
the side chain. 
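As an illustration only, the substitution table above can be encoded as a lookup and queried programmatically. The sketch below assumes standard three-letter residue codes; the names `CONSERVATIVE` and `is_conservative` are hypothetical. The lookup is directional, following the table as written: Cys may be replaced by Ala, but Ala is not listed as replaceable by Cys.

```python
# Conservative substitutions transcribed from the table in the text.
# Keys are original residues; values are the listed acceptable replacements.
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the table lists `replacement` as conservative for `original`.

    Residues absent from the table (e.g., Pro) have no listed substitutions.
    """
    return replacement in CONSERVATIVE.get(original, set())


print(is_conservative("Cys", "Ala"))  # True
print(is_conservative("Ala", "Cys"))  # False
```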

A "deletion" refers to a change in the amino acid or nucleotide sequence that results in the
absence of one or more amino acid residues or nucleotides.

The term "derivative" refers to a chemically modified polynucleotide or polypeptide. 
Chemical modifications of a polynucleotide can include, for example, replacement of hydrogen by an 
alkyl, acyl, hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which retains 
at least one biological or immunological function of the natural molecule. A derivative polypeptide is
one modified by glycosylation, pegylation, or any similar process that retains at least one biological or 
immunological function of the polypeptide from which it was derived. 

A "detectable label" refers to a reporter molecule or enzyme that is capable of generating a
measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide.

"Differential expression" refers to increased or upregulated; or decreased, downregulated, or
absent gene or protein expression, determined by comparing at least two different samples. Such
comparisons may be carried out between, for example, a treated and an untreated sample, or a
diseased and a normal sample.
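A two-sample comparison of this kind might be sketched as follows. The two-fold cutoff, the function name `classify_expression`, and the use of a log2 ratio are illustrative assumptions; the specification does not prescribe a threshold or a metric.

```python
import math


def classify_expression(test_level: float, reference_level: float,
                        min_fold: float = 2.0) -> str:
    """Label expression in a test sample relative to a reference sample.

    Assumption: levels are non-negative measurements on a common scale, and a
    change of at least `min_fold` (hypothetical default: two-fold) is treated
    as differential expression.
    """
    if test_level <= 0:
        # Nothing detected in the test sample.
        return "absent" if reference_level > 0 else "unchanged"
    if reference_level <= 0:
        # Detected only in the test sample.
        return "increased"
    log2_ratio = math.log2(test_level / reference_level)
    cutoff = math.log2(min_fold)
    if log2_ratio >= cutoff:
        return "increased"
    if log2_ratio <= -cutoff:
        return "decreased"
    return "unchanged"


# Treated versus untreated sample, arbitrary units.
print(classify_expression(8.0, 2.0))   # increased
print(classify_expression(1.0, 4.0))   # decreased
print(classify_expression(0.0, 5.0))   # absent
```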

"Exon shuffling" refers to the recombination of different coding regions (exons). Since an 
exon may represent a structural or functional domain of the encoded protein, new proteins may be
assembled through the novel reassortment of stable substructures, thus allowing acceleration of the 
evolution of new protein functions. 

A "fragment" is a unique portion of NAAP or a polynucleotide encoding NAAP which can be
identical in sequence to, but shorter in length than, the parent sequence. A fragment may comprise up
to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a
fragment may comprise from about 5 to about 1000 contiguous nucleotides or amino acid residues. A
fragment used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least
5, 10, 15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides or amino
acid residues in length. Fragments may be preferentially selected from certain regions of a molecule.
For example, a polypeptide fragment may comprise a certain length of contiguous amino acids
selected from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a
certain defined sequence. Clearly these lengths are exemplary, and any length that is supported by the
specification, including the Sequence Listing, tables, and figures, may be encompassed by the present
embodiments.

A fragment of SEQ ID NO:37-72 can comprise a region of unique polynucleotide sequence
that specifically identifies SEQ ID NO:37-72, for example, as distinct from any other sequence in the
genome from which the fragment was obtained. A fragment of SEQ ID NO:37-72 can be employed
in one or more embodiments of methods of the invention, for example, in hybridization and
amplification technologies and in analogous methods that distinguish SEQ ID NO:37-72 from related
polynucleotides. The precise length of a fragment of SEQ ID NO:37-72 and the region of SEQ ID
NO:37-72 to which the fragment corresponds are routinely determinable by one of ordinary skill in the
art based on the intended purpose for the fragment.

A fragment of SEQ ID NO:1-36 is encoded by a fragment of SEQ ID NO:37-72. A
fragment of SEQ ID NO:1-36 can comprise a region of unique amino acid sequence that specifically
identifies SEQ ID NO:1-36. For example, a fragment of SEQ ID NO:1-36 can be used as an
immunogenic peptide for the development of antibodies that specifically recognize SEQ ID NO:1-36.
The precise length of a fragment of SEQ ID NO:1-36 and the region of SEQ ID NO:1-36 to which
the fragment corresponds can be determined based on the intended purpose for the fragment using
one or more analytical methods described herein or otherwise known in the art.

A "full length" polynucleotide is one containing at least a translation initiation codon (e.g., 
methionine) followed by an open reading frame and a translation termination codon. A "full length" 
polynucleotide sequence encodes a "full length" polypeptide sequence.

"Homology" refers to sequence similarity or, interchangeably, sequence identity, between two
or more polynucleotide sequences or two or more polypeptide sequences.

The terms "percent identity" and "% identity," as applied to polynucleotide sequences, refer to 
the percentage of residue matches between at least two polynucleotide sequences aligned using a 
standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in 
the sequences being compared in order to optimize alignment between two sequences, and therefore
achieve a more meaningful comparison of the two sequences. 
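
The definition above can be illustrated with a minimal sketch that counts residue matches across an alignment that has already been gapped by some alignment program. The function name `percent_identity`, the use of "-" as the gap character, and the choice of the alignment length as the denominator are assumptions made for illustration only; alignment programs such as those named herein may use a different denominator and may therefore report different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both sequences carry the same residue.

    Assumption: the two strings are rows of an existing pairwise alignment,
    so they have equal length and use `gap` to mark inserted gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns where both rows are gaps; they carry no comparison.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    # A column counts as a match only when both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Hypothetical example: one gap and one mismatch in nine columns.
print(round(percent_identity("ACGT-ACGT", "ACGTTACCT"), 1))
```

Counting a gapped column as a non-match, as done here, is one convention among several; restricting the denominator to ungapped columns would give a higher value for the same alignment.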

Percent identity between polynucleotide sequences may be determined using one or more 
computer algorithms or programs known in the art or described herein. For example, percent identity 
can be determined using the default parameters of the CLUSTAL V algorithm as incorporated into 
the MEGALIGN version 3.12e sequence alignment program. This program is part of the
LASERGENE software package, a suite of molecular biological analysis programs (DNASTAR,
Madison WI). CLUSTAL V is described in Higgins and Sharp (1989; CABIOS 5:151-153) and in
Higgins et al. (1992; CABIOS 8:189-191). For pairwise alignments of polynucleotide sequences, the
default parameters are set as follows: Ktuple=2, gap penalty=5, window=4, and "diagonals saved"=4.
The "weighted" residue weight table is selected as the default. Percent identity is reported by
CLUSTAL V as the "percent similarity" between aligned polynucleotide sequences.

Alternatively, a suite of commonly used and freely available sequence comparison algorithms 
which can be used is provided by the National Center for Biotechnology Information (NCBI) Basic
Local Alignment Search Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410), which
is available from several sources, including the NCBI, Bethesda, MD, and on the Internet at
http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence analysis
programs including "blastn," that is used to align a known polynucleotide sequence with other
polynucleotide sequences from a variety of databases. Also available is a tool called "BLAST 2
Sequences" that is used for direct pairwise comparison of two nucleotide sequences. "BLAST 2
Sequences" can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2.html. The
"BLAST 2 Sequences" tool can be used for both blastn and blastp (discussed below). BLAST
programs are commonly used with gap and other parameters set to default settings. For example, to
compare two nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) set at default parameters. Such default parameters may be, for example:

Matrix: BLOSUM62
Reward for match: 1
Penalty for mismatch: -2
Open Gap: 5 and Extension Gap: 2 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 11
Filter: on

Percent identity may be measured over the length of an entire defined sequence, for example,
as defined by a particular SEQ ID number, or may be measured over a shorter length, for example,
over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at
least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous
nucleotides. Such lengths are exemplary only, and it is understood that any fragment length supported
by the sequences shown herein, in the tables, figures, or Sequence Listing, may be used to describe a
length over which percentage identity may be measured.

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode 
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes 
in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid
sequences that all encode substantially the same protein. 
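
The effect of degeneracy can be shown with a short sketch that translates two different nucleotide sequences with the standard genetic code. The two example sequences are hypothetical and are not taken from the Sequence Listing; the helper name `translate` is likewise an illustrative assumption.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence, read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical sequences that differ at five of twelve positions.
seq_a = "ATGGCTTCTAAA"
seq_b = "ATGGCAAGCAAG"
print(translate(seq_a), translate(seq_b))
```

Both sequences translate to the same four-residue peptide even though they share only seven of twelve nucleotide positions, which is the situation the preceding paragraph describes.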

The phrases "percent identity" and "% identity," as applied to polypeptide sequences, refer to 
the percentage of residue matches between at least two polypeptide sequences aligned using a 
standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some alignment
methods take into account conservative amino acid substitutions. Such conservative substitutions,
explained in more detail above, generally preserve the charge and hydrophobicity at the site of
substitution, thus preserving the structure (and therefore function) of the polypeptide. 

Percent identity between polypeptide sequences may be determined using the default 
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e
sequence alignment program (described and referenced above). For pairwise alignments of
polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap
penalty=3, window=5, and "diagonals saved"=5. The PAM250 matrix is selected as the default
residue weight table. As with polynucleotide alignments, the percent identity is reported by
CLUSTAL V as the "percent similarity" between aligned polypeptide sequence pairs.

Alternatively, the NCBI BLAST software suite may be used. For example, for a pairwise
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) with blastp set at default parameters. Such default parameters may be, for
example:

Matrix: BLOSUM62
Open Gap: 11 and Extension Gap: 1 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 3
Filter: on

Percent identity may be measured over the length of an entire defined polypeptide sequence,
for example, as defined by a particular SEQ ID number, or may be measured over a shorter length,
for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for
instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least
150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment
length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be 
used to describe a length over which percentage identity may be measured. 

"Human artificial chromosomes" (HACs) are linear microchromosomes which may contain
DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for
chromosome replication, segregation and maintenance. 

The term "humanized antibody" refers to an antibody molecule in which the amino acid
sequence in the non-antigen binding regions has been altered so that the antibody more closely
resembles a human antibody, and still retains its original binding ability.

"Hybridization" refers to the process by which a polynucleotide strand anneals with a 
complementary strand through base pairing under defined hybridization conditions. Specific 
hybridization is an indication that two nucleic acid sequences share a high degree of complementarity. 
Specific hybridization complexes form under permissive annealing conditions and remain hybridized 
after the "washing" step(s). The washing step(s) is particularly important in determining the
stringency of the hybridization process, with more stringent conditions allowing less non-specific
binding, i.e., binding between pairs of nucleic acid strands that are not perfectly matched. Permissive
conditions for annealing of nucleic acid sequences are routinely determinable by one of ordinary skill in
the art and may be consistent among hybridization experiments, whereas wash conditions may be
varied among experiments to achieve the desired stringency, and therefore hybridization specificity.
Permissive annealing conditions occur, for example, at 68°C in the presence of about 6 x SSC, about
1% (w/v) SDS, and about 100 μg/ml sheared, denatured salmon sperm DNA.

Generally, stringency of hybridization is expressed, in part, with reference to the temperature 
under which the wash step is carried out. Such wash temperatures are typically selected to be about 
5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of
the target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and
conditions for nucleic acid hybridization are well known and can be found in Sambrook et al. (1989;
Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview NY;
specifically see volume 2, chapter 9).
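
The relationship between Tm and wash temperature can be sketched as follows. The present text does not reproduce an equation for Tm, so the formula used below is an assumption: it is one widely quoted empirical approximation for long DNA:DNA hybrids in aqueous solution, and it is not necessarily the equation given in the cited reference. The function names and the example inputs are also hypothetical.

```python
import math
from typing import Tuple


def approximate_tm(sequence: str, sodium_molar: float) -> float:
    """Estimate Tm in degrees C for a long DNA:DNA hybrid.

    Assumed approximation (not taken from the present text):
        Tm = 81.5 + 16.6 * log10([Na+]) + 0.41 * (%GC) - 600 / N
    where N is the hybrid length in base pairs.
    """
    seq = sequence.upper()
    n = len(seq)
    if n == 0 or sodium_molar <= 0:
        raise ValueError("need a non-empty sequence and a positive salt concentration")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc_percent - 600.0 / n


def wash_temperature_window(tm_celsius: float) -> Tuple[float, float]:
    """Wash temperatures about 5 to 20 degrees C below Tm, as (lowest, highest)."""
    return (tm_celsius - 20.0, tm_celsius - 5.0)


# Hypothetical 200-base probe with 50% GC content, washed at 0.03 M sodium ion.
probe = "ACGT" * 50
tm = approximate_tm(probe, sodium_molar=0.03)
print(wash_temperature_window(tm))
```

Under this approximation a lower salt concentration lowers the estimated Tm, which is consistent with low-salt washes being the more stringent ones.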

High stringency conditions for hybridization between polynucleotides of the present invention 
include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS, for 1 hour. 
Alternatively, temperatures of about 65°C, 60°C, 55°C, or 42°C may be used. SSC concentration may
be varied from about 0.1 to 2 x SSC, with SDS being present at about 0.1%. Typically, blocking
reagents are used to block non-specific hybridization. Such blocking reagents include, for instance,
sheared and denatured salmon sperm DNA at about 100-200 μg/ml. Organic solvent, such as
formamide at a concentration of about 35-50% v/v, may also be used under particular circumstances,
such as for RNA:DNA hybridizations. Useful variations on these wash conditions will be readily
apparent to those of ordinary skill in the art. Hybridization, particularly under high stringency
conditions, may be suggestive of evolutionary similarity between the nucleotides. Such similarity is 
strongly indicative of a similar role for the nucleotides and their encoded polypeptides. 

The term "hybridization complex" refers to a complex formed between two nucleic acids by
virtue of the formation of hydrogen bonds between complementary bases. A hybridization complex
may be formed in solution (e.g., C0t or R0t analysis) or formed between one nucleic acid present in
solution and another nucleic acid immobilized on a solid support (e.g., paper, membranes, filters, chips, 
pins or glass slides, or any other appropriate substrate to which cells or their nucleic acids have been 
fixed). 

The words "insertion" and "addition" refer to changes in an amino acid or polynucleotide
sequence resulting in the addition of one or more amino acid residues or nucleotides, respectively.

"Immune response" can refer to conditions associated with inflammation, trauma, immune
disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression
of various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect
cellular and systemic defense systems.

An "immunogenic fragment" is a polypeptide or oligopeptide fragment of NAAP which is 
capable of eliciting an immune response when introduced into a living organism, for example, a 
mammal. The term "immunogenic fragment" also includes any polypeptide or oligopeptide fragment
of NAAP which is useful in any of the antibody production methods disclosed herein or known in the
art.

The term "microarray" refers to an arrangement of a plurality of polynucleotides, 
polypeptides, antibodies, or other chemical compounds on a substrate. 

The terms "element" and "array element" refer to a polynucleotide, polypeptide, antibody, or
other chemical compound having a unique and defined position on a microarray.

The term "modulate" refers to a change in the activity of NAAP. For example, modulation
may cause an increase or a decrease in protein activity, binding characteristics, or any other biological,
functional, or immunological properties of NAAP. 

The phrases "nucleic acid" and "nucleic acid sequence" refer to a nucleotide, oligonucleotide, 
polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or 
synthetic origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material. 

"Operably linked" refers to the situation in which a first nucleic acid sequence, is placed in a 
functional relationship with a second nucleic acid sequence. For instance, a promoter is operably 
linked to a coding sequence if the promoter affects the transcription or expression of the coding
sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where 
necessary to join two protein coding regions, in the same reading frame. 

"Peptide nucleic acid" (PNA) refers to an antisense molecule or anti-gene agent which
comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of
amino acid residues ending in lysine. The terminal lysine confers solubility to the composition. PNAs
preferentially bind complementary single stranded DNA or RNA and stop transcript elongation, and
may be pegylated to extend their lifespan in the cell.

"Post-translational modification" of an NAAP may involve lipidation, glycosylation,
phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the
art. These processes may occur synthetically or biochemically. Biochemical modifications will vary 
by cell type depending on the enzymatic milieu of NAAP. 

"Probe" refers to nucleic acids encoding NAAP, their complements, or fragments thereof, 
which are used to detect identical, allelic or related nucleic acids. Probes are isolated oligonucleotides 
or polynucleotides attached to a detectable label or reporter molecule. Typical labels include
radioactive isotopes, ligands, chemiluminescent agents, and enzymes. "Primers" are short nucleic
acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide by
complementary base-pairing. The primer may then be extended along the target DNA strand by a
DNA polymerase enzyme. Primer pairs can be used for amplification (and identification) of a nucleic
acid, e.g., by the polymerase chain reaction (PCR).
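
Annealing by complementary base-pairing can be sketched as a search for the reverse complement of a primer within a target strand. The sketch below assumes perfect matching over the full primer length and uses hypothetical sequences and function names; real annealing also tolerates mismatches depending on stringency, which is not modeled here.

```python
# Watson-Crick pairing for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def annealing_sites(primer: str, target_strand: str) -> list:
    """0-based start positions on the target (5' to 3') where the primer pairs perfectly."""
    site = reverse_complement(primer)
    target = target_strand.upper()
    return [
        i
        for i in range(len(target) - len(site) + 1)
        if target.startswith(site, i)
    ]


# Hypothetical target strand and a primer designed against bases 7-12 of it.
target = "GGATCCAAGTTTCCC"
primer = reverse_complement("AAGTTT")
print(primer, annealing_sites(primer, target))
```

Once annealed at such a site, the 3' end of the primer is the point from which a DNA polymerase would extend along the target strand.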

Probes and primers as used in the present invention typically comprise at least 15 contiguous 
nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also 
be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 
or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers 
may be considerably longer than these examples, and it is understood that any length supported by the
specification, including the tables, figures, and Sequence Listing, may be used.

Methods for preparing and using probes and primers are described in the references, for 
example Sambrook et al. (supra); Ausubel, F.M. et al. (1987; Current Protocols in Molecular Biology,
Greene Publ. Assoc. & Wiley-Intersciences, New York NY); and Innis, M. et al. (1990; PCR
Protocols, A Guide to Methods and Applications, Academic Press, San Diego CA). PCR primer pairs
can be derived from a known sequence, for example, by using computer programs intended for that 
purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge 
MA).

Oligonucleotides for use as primers are selected using software known in the art for such
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000
nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection
programs have incorporated additional features for expanded capabilities. For example, the PrimOU
primer selection program (available to the public from the Genome Center at University of Texas
South West Medical Center, Dallas TX) is capable of choosing specific primers from megabase
sequences and is thus useful for designing primers on a genome-wide scope. The Primer3 primer
selection program (available to the public from the Whitehead Institute/MIT Center for Genome
Research, Cambridge MA) allows the user to input a "mispriming library," in which sequences to
avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the selection of
oligonucleotides for microarrays. (The source code for the latter two primer selection programs may
also be obtained from their respective sources and modified to meet the user's specific needs.) The
PrimeGen program (available to the public from the UK Human Genome Mapping Project Resource
Centre, Cambridge UK) designs primers based on multiple sequence alignments, thereby allowing
selection of primers that hybridize to either the most conserved or least conserved regions of aligned
nucleic acid sequences. Hence, this program is useful for identification of both unique and conserved
oligonucleotides and polynucleotide fragments. The oligonucleotides and polynucleotide fragments
identified by any of the above selection methods are useful in hybridization technologies, for example,
as PCR or sequencing primers, microarray elements, or specific probes to identify fully or partially
complementary polynucleotides in a sample of nucleic acids. Methods of oligonucleotide selection are
not limited to those described above. 

A "recombinant nucleic acid" is a nucleic acid that is not naturally occurring or has a 
sequence that is made by an artificial combination of two or more otherwise separated segments of 
sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly,
by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering 
techniques such as those described in Sambrook, supra. The term recombinant includes nucleic acids 
that have been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. 
Frequently, a recombinant nucleic acid may include a nucleic acid sequence operably linked to a 
promoter sequence. Such a recombinant nucleic acid may be part of a vector that is used, for
example, to transform a cell. 

Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a 
vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is
expressed, inducing a protective immunological response in the mammal.

A "regulatory element" refers to a nucleic acid sequence usually derived from untranslated 
regions of a gene and includes enhancers, promoters, introns, and 5' and 3' untranslated regions 
(UTRs). Regulatory elements interact with host or viral proteins which control transcription, 
translation, or RNA stability.

"Reporter molecules" are chemical or biochemical moieties used for labeling a nucleic acid, 
amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent, 
chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and
other moieties known in the art.

An "RNA equivalent," in reference to a DNA molecule, is composed of the same linear
sequence of nucleotides as the reference DNA molecule with the exception that all occurrences of
the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose 
instead of deoxyribose. 
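
At the level of sequence notation, the definition above amounts to a single base substitution, as the following minimal sketch shows. The function name `rna_equivalent` and the example sequence are illustrative assumptions; the change of sugar from deoxyribose to ribose is a chemical difference that a letter string cannot represent.

```python
def rna_equivalent(dna: str) -> str:
    """Write the RNA equivalent of a DNA sequence: same order, with T replaced by U."""
    return dna.replace("T", "U").replace("t", "u")


# Hypothetical example sequence.
print(rna_equivalent("ATGCTTGA"))
```

The linear order and length of the sequence are unchanged; only the thymine positions differ.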

The term "sample" is used in its broadest sense. A sample suspected of containing NAAP, 
nucleic acids encoding NAAP, or fragments thereof may comprise a bodily fluid; an extract from a
cell, chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or cDNA,
in solution or bound to a substrate; a tissue; a tissue print; etc. 

The terms "specific binding" and "specifically binding" refer to that interaction between a 
protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or 
synthetic binding composition. The interaction is dependent upon the presence of a particular structure
of the protein, e.g., the antigenic determinant or epitope, recognized by the binding molecule. For 
example, if an antibody is specific for epitope "A," the presence of a polypeptide comprising the 
epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A and the 
antibody will reduce the amount of labeled A that binds to the antibody.

The term "substantially purified" refers to nucleic acid or amino acid sequences that are
removed from their natural environment and are isolated or separated, and are at least about 60%
free, preferably at least about 75% free, and most preferably at least about 90% free from other 
components with which they are naturally associated. 

A "substitution" refers to the replacement of one or more amino acid residues or nucleotides 
by different amino acid residues or nucleotides, respectively.

"Substrate" refers to any suitable rigid or semi-rigid support including membranes, filters, 
chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, 
microparticles and capillaries. The substrate can have a variety of surface forms, such as wells, 
trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.

A "transcript image" or "expression profile" refers to the collective pattern of gene expression 
by a particular cell type or tissue under given conditions at a given time. 

"Transformation" describes a process by which exogenous DNA is introduced into a recipient
cell. Transformation may occur under natural or artificial conditions according to various methods
well known in the art, and may rely on any known method for the insertion of foreign nucleic acid
sequences into a prokaryotic or eukaryotic host cell. The method for transformation is selected based
on the type of host cell being transformed and may include, but is not limited to, bacteriophage or viral
infection, electroporation, heat shock, lipofection, and particle bombardment. The term "transformed
cells" includes stably transformed cells in which the inserted DNA is capable of replication either as
an autonomously replicating plasmid or as part of the host chromosome, as well as transiently 
transformed cells which express the inserted DNA or RNA for limited periods of time. 

A "transgenic organism," as used herein, is any organism, including but not limited to animals 
and plants, in which one or more of the cells of the organism contains heterologous nucleic acid
introduced by way of human intervention, such as by transgenic techniques well known in the art. The
nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor of the cell,
by way of deliberate genetic manipulation, such as by microinjection or by infection with a
recombinant virus. In another embodiment, the nucleic acid can be introduced by infection with a
recombinant viral vector, such as a lentiviral vector (Lois, C. et al. (2002) Science 295:868-872). The
term genetic manipulation does not include classical cross-breeding, or in vitro fertilization, but rather
is directed to the introduction of a recombinant DNA molecule. The transgenic organisms
contemplated in accordance with the present invention include bacteria, cyanobacteria, fungi, plants
and animals. The isolated DNA of the present invention can be introduced into the host by methods
known in the art, for example infection, transfection, transformation or transconjugation. Techniques
for transferring the DNA of the present invention into such organisms are widely known and provided
in references such as Sambrook et al., supra. 

A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having
at least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of
the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999)
set at default parameters. Such a pair of nucleic acids may show, for example, at least 50%, at
least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least
93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater
sequence identity over a certain defined length. A variant may be described as, for example, an
"allelic" (as defined above), "splice," "species," or "polymorphic" variant. A splice variant may have
significant identity to a reference molecule, but will generally have a greater or lesser number of 
polynucleotides due to alternate splicing of exons during mRNA processing. The corresponding 
polypeptide may possess additional functional domains or lack domains that are present in the 
reference molecule. Species variants are polynucleotides that vary from one species to another. The
resulting polypeptides will generally have significant amino acid identity relative to each other. A 
polymorphic variant is a variation in the polynucleotide sequence of a particular gene between 
individuals of a given species. Polymorphic variants also may encompass "single nucleotide 
polymorphisms" (SNPs) in which the polynucleotide sequence varies by one nucleotide base. The 
presence of SNPs may be indicative of, for example, a certain population, a disease state, or a
propensity for a disease state. 
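
Single-base differences of the kind described above can be located with a short comparison of two sequences of equal length. The sketch below is illustrative only: the function name and example sequences are hypothetical, and it assumes the two sequences are already in register, so that insertions and deletions (which would require an alignment step) are not considered.

```python
def single_nucleotide_differences(reference: str, variant: str) -> list:
    """List (1-based position, reference base, variant base) for each differing position."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length; align them first")
    return [
        (position, ref_base, var_base)
        for position, (ref_base, var_base) in enumerate(
            zip(reference.upper(), variant.upper()), start=1
        )
        if ref_base != var_base
    ]


# Hypothetical reference and variant differing at one base.
print(single_nucleotide_differences("ATGGCTTCT", "ATGGCATCT"))
```

A result containing exactly one entry corresponds to a variation of one nucleotide base between the two sequences.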

A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having 
at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of 
the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999)
set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at
least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 
94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence 
identity over a certain defined length of one of the polypeptides. 

THE INVENTION 

Various embodiments of the invention include new human nucleic acid-associated proteins 
(NAAP), the polynucleotides encoding NAAP, and the use of these compositions for the diagnosis, 
treatment, or prevention of cell proliferative, neurological, reproductive, developmental, 
autoimmune/inflammatory, and DNA repair disorders, and infections. 

Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide 
embodiments of the invention. Each polynucleotide and its corresponding polypeptide are correlated to 
a single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is 
denoted by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an 
Incyte polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide 
sequence is denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ ID 
NO:) and an Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as shown. 
Column 6 shows the Incyte ID numbers of physical, full length clones corresponding to polypeptide 
and polynucleotide embodiments. The full length clones encode polypeptides which have at least 95% 
sequence identity to the polypeptides shown in column 3.

Table 2 shows sequences with homology to the polypeptides of the invention as identified by 
BLAST analysis against the GenBank protein (genpept) database and the PROTEOME database. 
Columns 1 and 2 show the polypeptide sequence identification number (Polypeptide SEQ ID NO:) and 
the corresponding Incyte polypeptide sequence number (Incyte Polypeptide ID) for polypeptides of the
invention. Column 3 shows the GenBank identification number (GenBank ID NO:) of the nearest 
GenBank homolog and the PROTEOME database identification numbers (PROTEOME ID NO:) of 
the nearest PROTEOME database homologs. Column 4 shows the probability scores for the matches 
between each polypeptide and its homolog(s). Column 5 shows the annotation of the GenBank and 
PROTEOME database homolog(s) along with relevant citations where applicable, all of which are
expressly incorporated by reference herein. 

Table 3 shows various structural features of the polypeptides of the invention. Columns 1 and 
2 show the polypeptide sequence identification number (SEQ ID NO:) and the corresponding Incyte 
polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention. Column 
3 shows the number of amino acid residues in each polypeptide. Column 4 shows potential
phosphorylation sites, and column 5 shows potential glycosylation sites, as determined by the MOTIFS
program of the GCG sequence analysis software package (Genetics Computer Group, Madison WI). 
Column 6 shows amino acid residues comprising signature sequences, domains, and motifs. Column 7 
shows analytical methods for protein structure/function analysis and in some cases, searchable 
databases to which the analytical methods were applied.

Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and these 
properties establish that the claimed polypeptides are nucleic acid-associated proteins. For example, 
SEQ ID NO:1 is 88% identical, from residue M1 to residue L304, to mouse genomic screen homeobox
protein 2 (GenBank ID g1042009) as determined by the Basic Local Alignment Search Tool
(BLAST). The BLAST probability score is 2.1e-146, which indicates the probability of obtaining the
observed polypeptide sequence alignment by chance. SEQ ID NO:1 also contains a homeobox
domain as determined by searching for statistically significant matches in the hidden Markov model
(HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from
BLIMPS, MOTIFS, PROFILESCAN, and additional BLAST analyses provide further corroborative
evidence that SEQ ID NO:1 is a homeobox protein.

In an alternative example, SEQ ID NO:8 is 99% identical, from residue G24 to residue E384, 
to human DNA-binding protein B (GenBank ID g181486) as determined by the Basic Local
Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 5.8e-199, which
indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ
ID NO:8 also contains a 'cold-shock' DNA-binding domain as determined by searching for statistically
significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein
family domains. (See Table 3.) Data from BLIMPS, MOTIFS, and PROFILESCAN analyses
provide further corroborative evidence that SEQ ID NO:8 is a DNA-binding protein.

In another example, SEQ ID NO:13 is 94% identical, from residue M780 to residue E1598, to 
human centriole associated protein CEP110 (GenBank ID g3435244) as determined by the Basic 
Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 0.0, which 
indicates the probability of obtaining the observed polypeptide sequence alignment by chance. Data
from MOTIFS and further BLAST analyses provide corroborative evidence that SEQ ID NO:13 is a
centriole associated protein. 

In yet another example, SEQ ID NO:15 is 87% identical, from residue E435 to residue L2523,
to human bromodomain PHD finger transcription factor (GenBank ID g6683492) as determined by the
Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 0.0,
which indicates the probability of obtaining the observed polypeptide sequence alignment by chance.
SEQ ID NO:15 also contains a PHD finger domain and a bromodomain as determined by searching
for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of
conserved protein family domains. (See Table 3.) Data from BLIMPS, MOTIFS, and
PROFILESCAN analyses provide further corroborative evidence that SEQ ID NO:15 is a
bromodomain PHD finger transcription factor.

In an alternative example, SEQ ID NO:21 is 41% identical, from residue T141 to residue 
E370, to human Kruppel-like zinc finger protein HZF2 (GenBank ID g8163824) as determined by the
Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is
6.7e-71, which indicates the probability of obtaining the observed polypeptide sequence alignment by
chance. SEQ ID NO:21 also contains zinc finger C2H2 type domains as determined by searching for
statistically significant matches in the hidden Markov model (HMM)-based PFAM database of 
conserved protein family domains. (See Table 3.) Data from BLIMPS and MOTIFS analyses 
provide further corroborative evidence that SEQ ID NO:21 is a C2H2 type zinc finger protein. 

In yet another example, SEQ ID NO:30 is 33% identical, from residue T556 to residue E1699, 
and 32% identical, from residue S10 to residue Y211, to Schizosaccharomyces pombe putative helicase
(GenBank ID g6901197) as determined by the Basic Local Alignment Search Tool (BLAST). (See
Table 2.) The BLAST probability score is 2.9e-137, which indicates the probability of obtaining the
observed polypeptide sequence alignment by chance. SEQ ID NO:30 also contains a DEAD/DEAH
box helicase domain as determined by searching for statistically significant matches in the hidden
Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) 
Data from BLAST analyses of the PRODOM and DOMO databases provide further corroborative 
evidence that SEQ ID NO:30 is a helicase. 

In yet another example, SEQ ID NO:33 is 97% identical, from residue M1 to residue V602, to a
murine T-box transcription factor (GenBank ID g3169261) as determined by the Basic Local
Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 0.0, which 
indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ 
ID NO:33 also contains a T-box domain as determined by searching for statistically significant 
matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family 
domains. (See Table 3.) Data from BLIMPS, PRODOM and DOMO BLAST, and MOTIFS 
analyses provide further corroborative evidence that SEQ ID NO:33 is a transcription factor molecule. 

Taken together, the foregoing provides evidence that SEQ ID NO:1, SEQ ID NO:8, SEQ ID
NO:13, SEQ ID NO:15, SEQ ID NO:21, SEQ ID NO:30, and SEQ ID NO:33 are all molecules
associated with nucleic acids. SEQ ID NO:2-7, SEQ ID NO:9-12, SEQ ID NO:14, SEQ ID NO:16-20,
SEQ ID NO:22-29, SEQ ID NO:31-32, and SEQ ID NO:34-36 were analyzed and annotated in a
similar manner. The algorithms and parameters for the analysis of SEQ ID NO:1-36 are described in
Table 7. 

As shown in Table 4, the full length polynucleotide embodiments were assembled using cDNA 
sequences or coding (exon) sequences derived from genomic DNA, or any combination of these two 
types of sequences. Column 1 lists the polynucleotide sequence identification number (Polynucleotide 
SEQ ID NO:), the corresponding Incyte polynucleotide consensus sequence number (Incyte ID) for 
each polynucleotide of the invention, and the length of each polynucleotide sequence in base pairs.
Column 2 shows the nucleotide start (5') and stop (3') positions of the cDNA and/or genomic 
sequences used to assemble the full length polynucleotide embodiments, and of fragments of the 
polynucleotides which are useful, for example, in hybridization or amplification technologies that 
identify SEQ ID NO:37-72 or that distinguish between SEQ ID NO:37-72 and related polynucleotides. 

The polynucleotide fragments described in Column 2 of Table 4 may refer specifically, for 
example, to Incyte cDNAs derived from tissue-specific cDNA libraries or from pooled cDNA 
libraries. Alternatively, the polynucleotide fragments described in column 2 may refer to GenBank 
cDNAs or ESTs which contributed to the assembly of the full length polynucleotides. In addition, the 
polynucleotide fragments described in column 2 may identify sequences derived from the ENSEMBL 
(The Sanger Centre, Cambridge, UK) database (i.e., those sequences including the designation
"ENST"). Alternatively, the polynucleotide fragments described in column 2 may be derived from the
NCBI RefSeq Nucleotide Sequence Records Database (i.e., those sequences including the
designation "NM" or "NT") or the NCBI RefSeq Protein Sequence Records (i.e., those sequences
including the designation "NP"). Alternatively, the polynucleotide fragments described in column 2
may refer to assemblages of both cDNA and Genscan-predicted exons brought together by an "exon
stitching" algorithm. For example, a polynucleotide sequence identified as
FL_XXXXXX_N1_N2_YYYYY_N3_N4 represents a "stitched" sequence in which XXXXXX is the
identification number of the cluster of sequences to which the algorithm was applied, YYYYY is the
number of the prediction generated by the algorithm, and N1,2,3..., if present, represent specific exons
that may have been manually edited during analysis (See Example V). Alternatively, the
polynucleotide fragments in column 2 may refer to assemblages of exons brought together by an
"exon-stretching" algorithm. For example, a polynucleotide sequence identified as
FLXXXXXX_gAAAAA_gBBBBB_1_N is a "stretched" sequence, with XXXXXX being the Incyte
project identification number, gAAAAA being the GenBank identification number of the human
genomic sequence to which the "exon-stretching" algorithm was applied, gBBBBB being the GenBank
identification number or NCBI RefSeq identification number of the nearest GenBank protein homolog,
and N referring to specific exons (See Example V). In instances where a RefSeq sequence was used
as a protein homolog for the "exon-stretching" algorithm, a RefSeq identifier (denoted by "NM,"
"NP," or "NT") may be used in place of the GenBank identifier (i.e., gBBBBB).

Alternatively, a prefix identifies component sequences that were hand-edited, predicted from
genomic DNA sequences, or derived from a combination of sequence analysis methods. The
following Table lists examples of component sequence prefixes and corresponding sequence analysis 
methods associated with the prefixes (see Example IV and Example V). 



Prefix              Type of analysis and/or examples of programs

GNN, GFG, ENST      Exon prediction from genomic sequences using, for example,
                    GENSCAN (Stanford University, CA, USA) or FGENES
                    (Computer Genomics Group, The Sanger Centre, Cambridge, UK)

GBI                 Hand-edited analysis of genomic sequences.

FL                  Stitched or stretched genomic sequences (see Example V).

INCY                Full length transcript and exon prediction from mapping of EST
                    sequences to the genome. Genomic location and EST composition
                    data are combined to predict the exons and resulting transcript.

In some cases, Incyte cDNA coverage redundant with the sequence coverage shown in
Table 4 was obtained to confirm the final consensus polynucleotide sequence, but the relevant Incyte 
cDNA identification numbers are not shown. 

Table 5 shows the representative cDNA libraries for those full length polynucleotides which
were assembled using Incyte cDNA sequences. The representative cDNA library is the Incyte 
cDNA library which is most frequently represented by the Incyte cDNA sequences which were used 
to assemble and confirm the above polynucleotides. The tissues and vectors which were used to 
construct the cDNA libraries shown in Table 5 are described in Table 6. 

Table 8 shows single nucleotide polymorphisms (SNPs) found in polynucleotide embodiments, 
along with allele frequencies in different human populations. Columns 1 and 2 show the polynucleotide 
sequence identification number (SEQ ID NO:) and the corresponding Incyte project identification 
number (PID) for polynucleotides of the invention. Column 3 shows the Incyte identification number 
for the EST in which the SNP was detected (EST ID), and column 4 shows the identification number 
for the SNP (SNP ID). Column 5 shows the position within the EST sequence at which the SNP is 
located (EST SNP), and column 6 shows the position of the SNP within the full-length polynucleotide 
sequence (CB1 SNP). Column 7 shows the allele found in the EST sequence. Columns 8 and 9 show 
the two alleles found at the SNP site. Column 10 shows the amino acid encoded by the codon 
including the SNP site, based upon the allele found in the EST. Columns 11-14 show the frequency of
allele 1 in four different human populations. An entry of n/d (not detected) indicates that the
frequency of allele 1 in the population was too low to be detected, while n/a (not available) indicates 
that the allele frequency was not determined for the population. 

The invention also encompasses NAAP variants. A preferred NAAP variant is one which 
has at least about 80%, or alternatively at least about 90%, or even at least about 95% amino acid 
sequence identity to the NAAP amino acid sequence, and which contains at least one functional or 
structural characteristic of NAAP. 

Various embodiments also encompass polynucleotides which encode NAAP. In a particular 
embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected 
from the group consisting of SEQ ID NO:37-72, which encodes NAAP. The polynucleotide
sequences of SEQ ID NO:37-72, as presented in the Sequence Listing, embrace the equivalent RNA 
sequences, wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the 
sugar backbone is composed of ribose instead of deoxyribose. 
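
The DNA-to-RNA equivalence described above can be illustrated with a minimal sketch. The function name `rna_equivalent` and the accepted alphabet (A, C, G, T, and N for an unspecified base) are assumptions made for this illustration and are not part of the Sequence Listing. The change of sugar backbone from deoxyribose to ribose is chemical and is not represented in the letter code.

```python
def rna_equivalent(dna: str) -> str:
    """Return the RNA sequence equivalent to a DNA sequence.

    Every thymine (T) is replaced with uracil (U); all other bases
    are left unchanged. Input is case-insensitive.
    """
    seq = dna.upper()
    # Assumed alphabet for this sketch: the four DNA bases plus N.
    invalid = set(seq) - set("ACGTN")
    if invalid:
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(invalid)}")
    return seq.replace("T", "U")


if __name__ == "__main__":
    print(rna_equivalent("ATGGCCTTTAA"))
```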

The invention also encompasses variants of a polynucleotide encoding NAAP. In particular, 
such a variant polynucleotide will have at least about 70%, or alternatively at least about 85%, or even
at least about 95% polynucleotide sequence identity to a polynucleotide encoding NAAP. A particular
aspect of the invention encompasses a variant of a polynucleotide comprising a sequence selected 
from the group consisting of SEQ ID NO:37-72 which has at least about 70%, or alternatively at least 
about 85%, or even at least about 95% polynucleotide sequence identity to a nucleic acid sequence 
selected from the group consisting of SEQ ID NO:37-72. Any one of the polynucleotide variants 
described above can encode a polypeptide which contains at least one functional or structural 
characteristic of NAAP. 
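
As an illustration of how a percent-identity threshold such as those recited above might be checked, the following sketch compares two sequences that have already been aligned. It assumes equal-length aligned strings with "-" marking gaps, and it counts identity over all aligned columns that are not gap-versus-gap. This is only one of several conventions for the denominator, and it is not the alignment procedure used to generate the tables; the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored. A column counts
    as a match only when both sequences carry the same non-gap symbol.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in alignment")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True when the aligned pair has at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    a = "ACGTACGTAC"
    b = "ACGTACGTAA"
    print(percent_identity(a, b), meets_identity_threshold(a, b, 85.0))
```

The thresholds of 70%, 85%, and 95% in the text can be passed as the `threshold` argument.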

In addition, or in the alternative, a polynucleotide variant of the invention is a splice variant of a 
polynucleotide encoding NAAP. A splice variant may have portions which have significant sequence 
identity to a polynucleotide encoding NAAP, but will generally have a greater or lesser number of
nucleotides due to additions or deletions of blocks of sequence arising from alternate splicing of
exons during mRNA processing. A splice variant may have less than about 70%, or alternatively less 
than about 60%, or alternatively less than about 50% polynucleotide sequence identity to a 
polynucleotide encoding NAAP over its entire length; however, portions of the splice variant will have 
at least about 70%, or alternatively at least about 85%, or alternatively at least about 95%, or 
alternatively 100% polynucleotide sequence identity to portions of the polynucleotide encoding NAAP. 
For example, a polynucleotide comprising a sequence of SEQ ID NO:72 is a splice variant of a 
polynucleotide comprising a sequence of SEQ ID NO:50. Any one of the splice variants described 
above can encode a polypeptide which contains at least one functional or structural characteristic of 
NAAP. 

It will be appreciated by those skilled in the art that as a result of the degeneracy of the 
genetic code, a multitude of polynucleotide sequences encoding NAAP, some bearing minimal 
similarity to the polynucleotide sequences of any known and naturally occurring gene, may be 
produced. Thus, the invention contemplates each and every possible variation of polynucleotide 
sequence that could be made by selecting combinations based on possible codon choices. These 
combinations are made in accordance with the standard triplet genetic code as applied to the 
polynucleotide sequence of naturally occurring NAAP, and all such variations are to be considered as 
being specifically disclosed. 
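
The scale of the variation permitted by codon degeneracy can be made concrete with a short sketch. It builds the standard triplet genetic code from the conventional compact TCAG-ordered encoding, counts how many distinct coding sequences specify a given peptide, and enumerates them for short peptides. The function names are hypothetical, and the example peptides are arbitrary illustrations rather than NAAP sequences.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Map each amino acid (or stop) to the list of codons that encode it.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def _codon_choices(peptide: str):
    """List of synonymous-codon lists, one per residue of the peptide."""
    try:
        return [SYNONYMS[aa] for aa in peptide.upper()]
    except KeyError as err:
        raise ValueError(f"unknown residue: {err}") from None


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(choices) for choices in _codon_choices(peptide))


def enumerate_encodings(peptide: str):
    """Yield every DNA sequence encoding the peptide (short peptides only)."""
    for combo in product(*_codon_choices(peptide)):
        yield "".join(combo)


if __name__ == "__main__":
    print(count_encodings("MSR"))
    print(sorted(enumerate_encodings("MF")))
```

Because the count is a product over residues, it grows very quickly with peptide length, so enumeration is practical only for short fragments.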

Although polynucleotides which encode NAAP and its variants are generally capable of 
hybridizing to polynucleotides encoding naturally occurring NAAP under appropriately selected 
conditions of stringency, it may be advantageous to produce polynucleotides encoding NAAP or its 
derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally occurring 
codons. Codons may be selected to increase the rate at which expression of the peptide occurs in a
particular prokaryotic or eukaryotic host in accordance with the frequency with which particular
codons are utilized by the host. Other reasons for substantially altering the nucleotide sequence 
encoding NAAP and its derivatives without altering the encoded amino acid sequences include the 
production of RNA transcripts having more desirable properties, such as a greater half-life, than 
transcripts produced from the naturally occurring sequence. 
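
Codon selection according to host usage, without altering the encoded amino acid sequence, might be sketched as follows. The usage values in `HYPOTHETICAL_USAGE` are invented for illustration only and do not describe any real organism; in practice a measured codon usage table for the chosen host would be supplied. The function names are likewise assumptions of this sketch.

```python
from itertools import product

# Standard genetic code in the conventional TCAG-ordered compact form.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    seq = dna.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def recode_for_host(dna: str, host_usage: dict) -> str:
    """Swap each codon for the synonymous codon most used by the host.

    `host_usage` maps codons to relative usage frequencies. Codons absent
    from the table are treated as frequency 0.0, and ties are broken in
    favor of the original codon, so the sequence is unchanged where the
    table gives no information.
    """
    seq = dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    recoded = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        aa = CODON_TABLE[codon]
        synonyms = [c for c, a in CODON_TABLE.items() if a == aa]
        best = max(synonyms, key=lambda c: (host_usage.get(c, 0.0), c == codon))
        recoded.append(best)
    return "".join(recoded)


# Invented values for illustration only; not real codon usage data.
HYPOTHETICAL_USAGE = {"CTG": 0.50, "CTA": 0.04, "AAA": 0.70, "AAG": 0.30}

if __name__ == "__main__":
    original = "ATGCTAAAG"
    optimized = recode_for_host(original, HYPOTHETICAL_USAGE)
    print(optimized, translate(original) == translate(optimized))
```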

The invention also encompasses production of polynucleotides which encode NAAP and 
NAAP derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the 
synthetic polynucleotide may be inserted into any of the many available expression vectors and cell
systems using reagents well known in the art. Moreover, synthetic chemistry may be used to
introduce mutations into a polynucleotide encoding NAAP or any fragment thereof. 

Embodiments of the invention can also include polynucleotides that are capable of hybridizing 
to the claimed polynucleotides, and, in particular, to those having the sequences shown in SEQ ID 
NO:37-72 and fragments thereof, under various conditions of stringency (Wahl, G.M. and S.L. Berger 
(1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol. 152:507-511).
Hybridization conditions, including annealing and wash conditions, are described in "Definitions."

Methods for DNA sequencing are well known in the art and may be used to practice any of 
the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment
of DNA polymerase I, SEQUENASE (US Biochemical, Cleveland OH), Taq polymerase (Applied 
Biosystems), thermostable T7 polymerase (Amersham Biosciences, Piscataway NJ), or combinations 
of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification 
system (Invitrogen, Carlsbad CA). Preferably, sequence preparation is automated with machines 
such as the MICROLAB 2200 liquid transfer system (Hamilton, Reno NV), PTC200 thermal cycler 
(MJ Research, Watertown MA) and ABI CATALYST 800 thermal cycler (Applied Biosystems).
Sequencing is then carried out using either the ABI 373 or 377 DNA sequencing system (Applied 
Biosystems), the MEGABACE 1000 DNA sequencing system (Amersham Biosciences), or other 
systems known in the art. The resulting sequences are analyzed using a variety of algorithms which
are well known in the art (Ausubel, F.M. (1997) Short Protocols in Molecular Biology , John Wiley & 
Sons, New York NY, unit 7.7; Meyers, R.A. (1995) Molecular Biology and Biotechnology , Wiley 
VCH, New York NY, pp. 856-853). 

The nucleic acids encoding NAAP may be extended utilizing a partial nucleotide sequence 
and employing various PCR-based methods known in the art to detect upstream sequences, such as 
promoters and regulatory elements. For example, one method which may be employed, restriction-site 
PCR, uses universal and nested primers to amplify unknown sequence from genomic DNA within a
cloning vector (Sarkar, G. (1993) PCR Methods Applic. 2:318-322). Another method, inverse PCR,
uses primers that extend in divergent directions to amplify unknown sequence from a circularized 
template. The template is derived from restriction fragments comprising a known genomic locus and 
surrounding sequences (Triglia, T. et al. (1988) Nucleic Acids Res. 16:8186). A third method, capture 
PCR, involves PCR amplification of DNA fragments adjacent to known sequences in human and 
yeast artificial chromosome DNA (Lagerstrom, M. et al. (1991) PCR Methods Applic. 1:111-119). In
this method, multiple restriction enzyme digestions and ligations may be used to insert an engineered 
double-stranded sequence into a region of unknown sequence before performing PCR. Other 
methods which may be used to retrieve unknown sequences are known in the art (Parker, J.D. et al. 
(1991) Nucleic Acids Res. 19:3055-3060). Additionally, one may use PCR, nested primers, and 
PROMOTERFINDER libraries (Clontech, Palo Alto CA) to walk genomic DNA. This procedure
avoids the need to screen libraries and is useful in finding intron/exon junctions. For all PCR-based 
methods, primers may be designed using commercially available software, such as OLIGO 4.06 
primer analysis software (National Biosciences, Plymouth MN) or another appropriate program, to be 
about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to the 
template at temperatures of about 68°C to 72°C. 
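
The length and GC-content criteria stated above can be screened with a simple check such as the following sketch. It is not the OLIGO software and does not estimate annealing temperature, since that depends on the thermodynamic model and reaction conditions chosen. The function names, the returned dictionary layout, and the example primer are hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer (0.0 to 1.0)."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer is empty")
    invalid = set(seq) - set("ACGT")
    if invalid:
        raise ValueError(f"unexpected characters in primer: {sorted(invalid)}")
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_primer(primer: str, min_len: int = 22, max_len: int = 30,
                 min_gc: float = 0.50) -> dict:
    """Check a primer against the length and GC-content criteria.

    Defaults follow the text: about 22 to 30 nucleotides in length and a
    GC content of about 50% or more.
    """
    gc = gc_fraction(primer)
    length_ok = min_len <= len(primer) <= max_len
    gc_ok = gc >= min_gc
    return {
        "length": len(primer),
        "gc_fraction": gc,
        "length_ok": length_ok,
        "gc_ok": gc_ok,
        "passes": length_ok and gc_ok,
    }


if __name__ == "__main__":
    # Arbitrary 24-nucleotide example sequence, not taken from the Sequence Listing.
    print(check_primer("GCGCATATGCGCATATGCGCATAT"))
```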

When screening for full length cDNAs, it is preferable to use libraries that have been 
size-selected to include larger cDNAs. In addition, random-primed libraries, which often include 
sequences containing the 5' regions of genes, are preferable for situations in which an oligo d(T) 
library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence 
into 5' non-transcribed regulatory regions.

Capillary electrophoresis systems which are commercially available may be used to analyze 
the size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary 
sequencing may employ flowable polymers for electrophoretic separation, four different nucleotide- 
specific, laser-stimulated fluorescent dyes, and a charge coupled device camera for detection of the 
emitted wavelengths. Output/light intensity may be converted to electrical signal using appropriate 
software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, Applied Biosystems), and the entire 
process from loading of samples to computer analysis and electronic data display may be computer
controlled. Capillary electrophoresis is especially preferable for sequencing small DNA fragments 
which may be present in limited amounts in a particular sample. 

In another embodiment of the invention, polynucleotides or fragments thereof which encode 
NAAP may be cloned in recombinant DNA molecules that direct expression of NAAP, or fragments
or functional equivalents thereof, in appropriate host cells. Due to the inherent degeneracy of the
genetic code, other polynucleotides which encode substantially the same or a functionally equivalent
polypeptide may be produced and used to express NAAP.

The polynucleotides of the invention can be engineered using methods generally known in the 
art in order to alter NAAP-encoding sequences for a variety of purposes including, but not limited to, 
modification of the cloning, processing, and/or expression of the gene product. DNA shuffling by
random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides may be
used to engineer the nucleotide sequences. For example, oligonucleotide-mediated site-directed
mutagenesis may be used to introduce mutations that create new restriction sites, alter glycosylation
patterns, change codon preference, produce splice variants, and so forth.

The nucleotides of the present invention may be subjected to DNA shuffling techniques such
as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent No.
5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat.
Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or improve
the biological properties of NAAP, such as its biological or enzymatic activity or its ability to bind to
other molecules or compounds. DNA shuffling is a process by which a library of gene variants is
produced using PCR-mediated recombination of gene fragments. The library is then subjected to
selection or screening procedures that identify those gene variants with the desired properties. These
preferred variants may then be pooled and further subjected to recursive rounds of DNA shuffling and
selection/screening. Thus, genetic diversity is created through "artificial" breeding and rapid molecular
evolution. For example, fragments of a single gene containing random point mutations may be
recombined, screened, and then reshuffled until the desired properties are optimized. Alternatively,
fragments of a given gene may be recombined with fragments of homologous genes in the same gene
family, either from the same or different species, thereby maximizing the genetic diversity of multiple
naturally occurring genes in a directed and controllable manner.

In another embodiment, polynucleotides encoding NAAP may be synthesized, in whole or in
part, using one or more chemical methods well known in the art (Caruthers, M.H. et al. (1980)
Nucleic Acids Symp. Ser. 7:215-223; and Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232).
Alternatively, NAAP itself or a fragment thereof may be synthesized using chemical methods known
in the art. For example, peptide synthesis can be performed using various solution-phase or
solid-phase techniques (Creighton, T. (1984) Proteins: Structures and Molecular Properties, WH
Freeman, New York NY, pp. 55-60; Roberge, J.Y. et al. (1995) Science 269:202-204). Automated
synthesis may be achieved using the ABI 431A peptide synthesizer (Applied Biosystems).
Additionally, the amino acid sequence of NAAP, or any part thereof, may be altered during direct
synthesis and/or combined with sequences from other proteins, or any part thereof, to produce a
variant polypeptide or a polypeptide having a sequence of a naturally occurring polypeptide.

The peptide may be substantially purified by preparative high performance liquid 
chromatography (Chiez, R.M. and F.Z. Regnier (1990) Methods Enzymol. 182:392-421). The
composition of the synthetic peptides may be confirmed by amino acid analysis or by sequencing
(Creighton, supra, pp. 28-53). 

In order to express a biologically active NAAP, the polynucleotides encoding NAAP or 
derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which contains 
the necessary elements for transcriptional and translational control of the inserted coding sequence in 
a suitable host. These elements include regulatory sequences, such as enhancers, constitutive and
inducible promoters, and 5' and 3' untranslated regions in the vector and in polynucleotides encoding
NAAP. Such elements may vary in their strength and specificity. Specific initiation signals may also
be used to achieve more efficient translation of polynucleotides encoding NAAP. Such signals include
the ATG initiation codon and adjacent sequences, e.g., the Kozak sequence. In cases where a
polynucleotide sequence encoding NAAP and its initiation codon and upstream regulatory sequences
are inserted into the appropriate expression vector, no additional transcriptional or translational control
signals may be needed. However, in cases where only coding sequence, or a fragment thereof, is
inserted, exogenous translational control signals including an in-frame ATG initiation codon should be
provided by the vector. Exogenous translational elements and initiation codons may be of various
origins, both natural and synthetic. The efficiency of expression may be enhanced by the inclusion of
enhancers appropriate for the particular host cell system used (Scharf, D. et al. (1994) Results Probl.
Cell Differ. 20:125-162).

Methods which are well known to those skilled in the art may be used to construct expression 
vectors containing polynucleotides encoding NAAP and appropriate transcriptional and translational
control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques,
and in vivo genetic recombination (Sambrook et al., supra, ch. 4, 8, and 16-17; Ausubel, F.M. et al. 
(1995) Current Protocols in Molecular Biology , John Wiley & Sons, New York NY, ch. 9, 13, and 16). 

A variety of expression vector/host systems may be utilized to contain and express 
polynucleotides encoding NAAP. These include, but are not limited to, microorganisms such as 
bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors;
yeast transformed with yeast expression vectors; insect cell systems infected with viral expression
vectors (e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g.,
cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression vectors
(e.g., Ti or pBR322 plasmids); or animal cell systems (Sambrook et al., supra; Ausubel, supra; Van
Heeke, G. and S.M. Schuster (1989) J. Biol. Chem. 264:5503-5509; Engelhard, E.K. et al. (1994)
Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945;
Takamatsu, N. (1987) EMBO J. 6:307-311; The McGraw Hill Yearbook of Science and Technology
(1992) McGraw Hill, New York NY, pp. 191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad.
Sci. USA 81:3655-3659; and Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355). Expression
vectors derived from retroviruses, adenoviruses, or herpes or vaccinia viruses, or from various
bacterial plasmids, may be used for delivery of polynucleotides to the targeted organ, tissue, or cell
population (Di Nicola, M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al. (1993) Proc.
Natl. Acad. Sci. USA 90(13):6340-6344; Buller, R.M. et al. (1985) Nature 317(6040):813-815;
McGregor, D.P. et al. (1994) Mol. Immunol. 31(3):219-226; and Verma, I.M. and N. Somia (1997)
Nature 389:239-242). The invention is not limited by the host cell employed.

In bacterial systems, a number of cloning and expression vectors may be selected depending 
upon the use intended for polynucleotides encoding NAAP. For example, routine cloning, subcloning, 
and propagation of polynucleotides encoding NAAP can be achieved using a multifunctional E. coli
vector such as PBLUESCRIPT (Stratagene, La Jolla CA) or PSPORT1 plasmid (Invitrogen).
Ligation of polynucleotides encoding NAAP into the vector's multiple cloning site disrupts the lacZ
gene, allowing a colorimetric screening procedure for identification of transformed bacteria containing
recombinant molecules. In addition, these vectors may be useful for in vitro transcription, dideoxy
sequencing, single strand rescue with helper phage, and creation of nested deletions in the cloned
sequence (Van Heeke, G. and S.M. Schuster (1989) J. Biol. Chem. 264:5503-5509). When large
quantities of NAAP are needed, e.g., for the production of antibodies, vectors which direct high level
expression of NAAP may be used. For example, vectors containing the strong, inducible SP6 or T7
bacteriophage promoter may be used.

Yeast expression systems may be used for production of NAAP. A number of vectors
containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH
promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such
vectors direct either the secretion or intracellular retention of expressed proteins and enable integration
of foreign polynucleotide sequences into the host genome for stable propagation (Ausubel, 1995,
supra; Bitter, G.A. et al. (1987) Methods Enzymol. 153:516-544; and Scorer, C.A. et al. (1994)
Bio/Technology 12:181-184).

Plant systems may also be used for expression of NAAP. Transcription of polynucleotides 
encoding NAAP may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used
alone or in combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J.
6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock
promoters may be used (Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984)
Science 224:838-843; and Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105). These
constructs can be introduced into plant cells by direct DNA transformation or pathogen-mediated
transfection (The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York
NY, pp. 191-196). 

In mammalian cells, a number of viral-based expression systems may be utilized. In cases 
where an adenovirus is used as an expression vector, polynucleotides encoding NAAP may be ligated
into an adenovirus transcription/translation complex consisting of the late promoter and tripartite leader
sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to obtain
infective virus which expresses NAAP in host cells (Logan, J. and T. Shenk (1984) Proc. Natl. Acad. 
Sci. USA 81:3655-3659). In addition, transcription enhancers, such as the Rous sarcoma virus (RSV) 
enhancer, may be used to increase expression in mammalian host cells. SV40 or EBV-based vectors 
may also be used for high-level protein expression. 

Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of 
DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are 
constructed and delivered via conventional delivery methods (liposomes, polycationic amino polymers, 
or vesicles) for therapeutic purposes (Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355).

For long term production of recombinant proteins in mammalian systems, stable expression of 
NAAP in cell lines is preferred. For example, polynucleotides encoding NAAP can be transformed 
into cell lines using expression vectors which may contain viral origins of replication and/or 
endogenous expression elements and a selectable marker gene on the same or on a separate vector. 
Following the introduction of the vector, cells may be allowed to grow for about 1 to 2 days in enriched 
media before being switched to selective media. The purpose of the selectable marker is to confer 
resistance to a selective agent, and its presence allows growth and recovery of cells which 
successfully express the introduced sequences. Resistant clones of stably transformed cells may be
propagated using tissue culture techniques appropriate to the cell type. 

Any number of selection systems may be used to recover transformed cell lines. These 
include, but are not limited to, the herpes simplex virus thymidine kinase and adenine 
phosphoribosyltransferase genes, for use in tk⁻ and apr⁻ cells, respectively (Wigler, M. et al. (1977)
Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823). Also, antimetabolite, antibiotic, or herbicide
resistance can be used as the basis for selection. For example, dhfr confers resistance to 
methotrexate; neo confers resistance to the aminoglycosides neomycin and G-418; and als and pat
confer resistance to chlorsulfuron and phosphinotricin acetyltransferase, respectively (Wigler, M. et al.
(1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol.
150:1-14). Additional selectable genes have been described, e.g., trpB and hisD, which alter cellular
requirements for metabolites (Hartman, S.C. and R.C. Mulligan (1988) Proc. Natl. Acad. Sci. USA
85:8047-8051). Visible markers, e.g., anthocyanins, green fluorescent proteins (GFP; Clontech),
β-glucuronidase and its substrate β-glucuronide, or luciferase and its substrate luciferin may be used.
These markers can be used not only to identify transformants, but also to quantify the amount of
transient or stable protein expression attributable to a specific vector system (Rhodes, C.A. (1995)
Methods Mol. Biol. 55:121-131).

Although the presence/absence of marker gene expression suggests that the gene of interest 
is also present, the presence and expression of the gene may need to be confirmed. For example, if 
the sequence encoding NAAP is inserted within a marker gene sequence, transformed cells containing 
polynucleotides encoding NAAP can be identified by the absence of marker gene function. 
Alternatively, a marker gene can be placed in tandem with a sequence encoding NAAP under the
control of a single promoter. Expression of the marker gene in response to induction or selection
usually indicates expression of the tandem gene as well.

In general, host cells that contain the polynucleotide encoding NAAP and that express NAAP 
may be identified by a variety of procedures known to those of skill in the art. These procedures 
include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and
protein bioassay or immunoassay techniques which include membrane, solution, or chip based 
technologies for the detection and/or quantification of nucleic acid or protein sequences. 

Immunological methods for detecting and measuring the expression of NAAP using either 
specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques 
include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and
fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing
monoclonal antibodies reactive to two non-interfering epitopes on NAAP is preferred, but a
competitive binding assay may be employed. These and other assays are well known in the art
(Hampton, R. et al. (1990) Serological Methods, a Laboratory Manual, APS Press, St. Paul MN, Sect.
IV; Coligan, J.E. et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and
Wiley-Interscience, New York NY; and Pound, J.D. (1998) Immunochemical Protocols, Humana Press,
Totowa NJ). 

A wide variety of labels and conjugation techniques are known by those skilled in the art and 
may be used in various nucleic acid and amino acid assays. Means for producing labeled hybridization
or PCR probes for detecting sequences related to polynucleotides encoding NAAP include
oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide.
Alternatively, polynucleotides encoding NAAP, or any fragments thereof, may be cloned into a vector
for the production of an mRNA probe. Such vectors are known in the art, are commercially available,
and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase
such as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety
of commercially available kits, such as those provided by Amersham Biosciences, Promega (Madison
WI), and US Biochemical. Suitable reporter molecules or labels which may be used for ease of
detection include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents, as
well as substrates, cofactors, inhibitors, magnetic particles, and the like. 

Host cells transformed with polynucleotides encoding NAAP may be cultured under 
conditions suitable for the expression and recovery of the protein from cell culture. The protein 
produced by a transformed cell may be secreted or retained intracellularly depending on the sequence 
and/or the vector used. As will be understood by those of skill in the art, expression vectors containing
polynucleotides which encode NAAP may be designed to contain signal sequences which direct
secretion of NAAP through a prokaryotic or eukaryotic cell membrane. 

In addition, a host cell strain may be chosen for its ability to modulate expression of the 
inserted polynucleotides or to process the expressed protein in the desired fashion. Such modifications 
of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation,
phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a "prepro" or
"pro" form of the protein may also be used to specify protein targeting, folding, and/or activity. 
Different host cells which have specific cellular machinery and characteristic mechanisms for 
post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the 
American Type Culture Collection (ATCC, Manassas VA) and may be chosen to ensure the correct
modification and processing of the foreign protein. 

In another embodiment of the invention, natural, modified, or recombinant polynucleotides 
encoding NAAP may be ligated to a heterologous sequence resulting in translation of a fusion protein 
in any of the aforementioned host systems. For example, a chimeric NAAP protein containing a 
heterologous moiety that can be recognized by a commercially available antibody may facilitate the
screening of peptide libraries for inhibitors of NAAP activity. Heterologous protein and peptide 
moieties may also facilitate purification of fusion proteins using commercially available affinity 
matrices. Such moieties include, but are not limited to, glutathione S-transferase (GST), maltose 
binding protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc, and
hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their cognate fusion
proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and metal-chelate resins,
respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity purification of fusion
proteins using commercially available monoclonal and polyclonal antibodies that specifically recognize
these epitope tags. A fusion protein may also be engineered to contain a proteolytic cleavage site
located between the NAAP encoding sequence and the heterologous protein sequence, so that NAAP
may be cleaved away from the heterologous moiety following purification. Methods for fusion protein
expression and purification are discussed in Ausubel (1995, supra, ch. 10). A variety of commercially
available kits may also be used to facilitate expression and purification of fusion proteins.

In another embodiment, synthesis of radiolabeled NAAP may be achieved in vitro using the 
TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These systems couple 
transcription and translation of protein-coding sequences operably associated with the T7, T3, or SP6 
promoters. Translation takes place in the presence of a radiolabeled amino acid precursor, for 
example, ³⁵S-methionine.

NAAP, fragments of NAAP, or variants of NAAP may be used to screen for compounds 
that specifically bind to NAAP. One or more test compounds may be screened for specific binding to
NAAP. In various embodiments, 1, 2, 3, 4, 5, 10, 20, 50, 100, or 200 test compounds can be screened
for specific binding to NAAP. Examples of test compounds can include antibodies, anticalins,
oligonucleotides, proteins (e.g., ligands or receptors), or small molecules.

In related embodiments, variants of NAAP can be used to screen for binding of test 
compounds, such as antibodies, to NAAP, a variant of NAAP, or a combination of NAAP and/or one
or more variants of NAAP. In an embodiment, a variant of NAAP can be used to screen for
compounds that bind to a variant of NAAP, but not to NAAP having the exact sequence of a
sequence of SEQ ID NO:1-36. NAAP variants used to perform such screening can have a range of
about 50% to about 99% sequence identity to NAAP, with various embodiments having 60%, 70%, 
75%, 80%, 85%, 90%, and 95% sequence identity. 
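
By way of illustration only, the percent identity figures recited above could be checked with a short calculation such as the following sketch. It assumes the two amino acid sequences have already been aligned to equal length by some alignment program; the function names percent_identity and in_variant_range, the gap character, and the choice to count every aligned column except gap-to-gap columns are hypothetical conventions adopted for this example, and alignment programs may use other conventions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption for this sketch: columns where both sequences carry a gap
    are skipped; every other column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # uninformative column, ignore
        compared += 1
        if a == b:
            matches += 1  # identical residues (a gap never equals a residue)
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def in_variant_range(identity: float, low: float = 50.0, high: float = 99.0) -> bool:
    """True if identity falls within the about 50% to about 99% range."""
    return low <= identity <= high


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from SEQ ID NO:1-36.
    identity = percent_identity("MKTA", "MKTG")
    print(identity, in_variant_range(identity))
```

A variant sharing three of four aligned residues with the reference would fall within the recited range, whereas a sequence identical to the reference would not.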

In an embodiment, a compound identified in a screen for specific binding to NAAP can be 
closely related to the natural ligand of NAAP, e.g., a ligand or fragment thereof, a natural substrate, a 
structural or functional mimetic, or a natural binding partner (Coligan, J.E. et al. (1991) Current
Protocols in Immunology 1(2):Chapter 5). In another embodiment, the compound thus identified can
be a natural ligand of a receptor NAAP (Howard, A.D. et al. (2001) Trends Pharmacol. Sci. 22:132-140;
Wise, A. et al. (2002) Drug Discovery Today 7:235-246).

In other embodiments, a compound identified in a screen for specific binding to NAAP can be
closely related to the natural receptor to which NAAP binds, at least a fragment of the receptor, or a
fragment of the receptor including all or a portion of the ligand binding site or binding pocket. For
example, the compound may be a receptor for NAAP which is capable of propagating a signal, or a
decoy receptor for NAAP which is not capable of propagating a signal (Ashkenazi, A. and V.M. Dixit
(1999) Curr. Opin. Cell Biol. 11:255-260; Mantovani, A. et al. (2001) Trends Immunol. 22:328-336).
The compound can be rationally designed using known techniques. Examples of such techniques 
include those used to construct the compound etanercept (ENBREL; Immunex Corp., Seattle WA),
which is efficacious for treating rheumatoid arthritis in humans. Etanercept is an engineered p75
tumor necrosis factor (TNF) receptor dimer linked to the Fc portion of human IgG1 (Taylor, P.C. et al.
(2001) Curr. Opin. Immunol. 13:611-616). 

In one embodiment, two or more antibodies having similar or, alternatively, different 
specificities can be screened for specific binding to NAAP, fragments of NAAP, or variants of 
NAAP. The binding specificity of the antibodies thus screened can thereby be selected to identify 
particular fragments or variants of NAAP. In one embodiment, an antibody can be selected such that 
its binding specificity allows for preferential identification of specific fragments or variants of NAAP. 
In another embodiment, an antibody can be selected such that its binding specificity allows for 
preferential diagnosis of a specific disease or condition having increased, decreased, or otherwise 
abnormal production of NAAP. 

In an embodiment, anticalins can be screened for specific binding to NAAP, fragments of 
NAAP, or variants of NAAP. Anticalins are ligand-binding proteins that have been constructed based 
on a lipocalin scaffold (Weiss, G.A. and H.B. Lowman (2000) Chem. Biol. 7:R177-R184; Skerra, A.
(2001) J. Biotechnol. 74:257-275). The protein architecture of lipocalins can include a beta-barrel 
having eight antiparallel beta-strands, which supports four loops at its open end. These loops form the 
natural ligand-binding site of the lipocalins, a site which can be re-engineered in vitro by amino acid 
substitutions to impart novel binding specificities. The amino acid substitutions can be made using 
methods known in the art or described herein, and can include conservative substitutions (e.g., 
substitutions that do not alter binding specificity) or substitutions that modestly, moderately, or 
significantly alter binding specificity. 

In one embodiment, screening for compounds which specifically bind to, stimulate, or inhibit 
NAAP involves producing appropriate cells which express NAAP, either as a secreted protein or on 
the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli. Cells 
expressing NAAP or cell membrane fractions which contain NAAP are then contacted with a test 
compound and binding, stimulation, or inhibition of activity of either NAAP or the compound is
analyzed. 

An assay may simply test binding of a test compound to the polypeptide, wherein binding is 
detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example, the
assay may comprise the steps of combining at least one test compound with NAAP, either in solution
or affixed to a solid support, and detecting the binding of NAAP to the compound. Alternatively, the
assay may detect or measure binding of a test compound in the presence of a labeled competitor.
Additionally, the assay may be carried out using cell-free preparations, chemical libraries, or natural
product mixtures, and the test compound(s) may be free in solution or affixed to a solid support.

An assay can be used to assess the ability of a compound to bind to its natural ligand and/or to
inhibit the binding of its natural ligand to its natural receptors. Examples of such assays include
radiolabeling assays such as those described in U.S. Patent No. 5,914,236 and U.S. Patent No. 6,372,724.
In a related embodiment, one or more amino acid substitutions can be introduced into a polypeptide
compound (such as a receptor) to improve or alter its ability to bind to its natural ligands (Matthews,
D.J. and J.A. Wells (1994) Chem. Biol. 1:25-30). In another related embodiment, one or more amino
acid substitutions can be introduced into a polypeptide compound (such as a ligand) to improve or alter
its ability to bind to its natural receptors (Cunningham, B.C. and J.A. Wells (1991) Proc. Natl. Acad.
Sci. USA 88:3407-3411; Lowman, H.B. et al. (1991) J. Biol. Chem. 266:10982-10988).

NAAP, fragments of NAAP, or variants of NAAP may be used to screen for compounds 
that modulate the activity of NAAP. Such compounds may include agonists, antagonists, or partial or
inverse agonists. In one embodiment, an assay is performed under conditions permissive for NAAP 
activity, wherein NAAP is combined with at least one test compound, and the activity of NAAP in the 
presence of a test compound is compared with the activity of NAAP in the absence of the test 
compound. A change in the activity of NAAP in the presence of the test compound is indicative of a 
compound that modulates the activity of NAAP. Alternatively, a test compound is combined with an
in vitro or cell-free system comprising NAAP under conditions suitable for NAAP activity, and the
assay is performed. In either of these assays, a test compound which modulates the activity of
NAAP may do so indirectly and need not come in direct contact with NAAP. At least one
and up to a plurality of test compounds may be screened. 
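
The comparison described above, in which NAAP activity in the presence of a test compound is compared with activity in its absence, can be sketched as a simple calculation. The following is a minimal illustration under stated assumptions and is not a validated assay analysis: the function name classify_modulator, the use of replicate means, and the 20% threshold for calling a change are all hypothetical choices made for this example, and a real screen would apply an appropriate statistical test.

```python
from statistics import mean


def classify_modulator(activity_with, activity_without, threshold=0.2):
    """Compare mean NAAP activity with and without a test compound.

    Returns (ratio, label), where ratio is mean activity with the compound
    divided by mean activity without it. The threshold (a fractional change,
    0.2 = 20%) is an arbitrary cutoff assumed for this sketch.
    """
    baseline = mean(activity_without)
    if baseline <= 0:
        raise ValueError("baseline activity must be positive")
    ratio = mean(activity_with) / baseline
    if ratio >= 1 + threshold:
        return ratio, "activator"
    if ratio <= 1 - threshold:
        return ratio, "inhibitor"
    return ratio, "no change"


if __name__ == "__main__":
    # Hypothetical replicate readings in arbitrary activity units.
    print(classify_modulator([5.0, 5.0], [10.0, 10.0]))
```

In this sketch a compound that halves the measured activity is labeled an inhibitor, consistent with the statement that a change in activity in the presence of the test compound is indicative of a modulating compound.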

In another embodiment, polynucleotides encoding NAAP or their mammalian homologs may
be "knocked out" in an animal model system using homologous recombination in embryonic stem (ES)
cells. Such techniques are well known in the art and are useful for the generation of animal models of 
human disease (U.S. Patent No. 5,175,383 and U.S. Patent No. 5,767,337). For example, mouse ES 
cells, such as the mouse 129/SvJ cell line, are derived from the early mouse embryo and grown in
culture. The ES cells are transformed with a vector containing the gene of interest disrupted by a
marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989) Science
244:1288-1292). The vector integrates into the corresponding region of the host genome by
homologous recombination. Alternatively, homologous recombination takes place using the Cre-loxP
system to knock out a gene of interest in a tissue- or developmental stage-specific manner (Marth, J.D.
(1996) Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330).
Transformed ES cells are identified and microinjected into mouse cell blastocysts such as those from 
the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and 
the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous 
strains. Transgenic animals thus generated may be tested with potential therapeutic or toxic agents.

Polynucleotides encoding NAAP may also be manipulated in vitro in ES cells derived from 
human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell 
lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate 
into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al.
(1998) Science 282:1145-1147). 

Polynucleotides encoding NAAP can also be used to create "knockin" humanized animals
(pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a region 
of a polynucleotide encoding NAAP is injected into animal ES cells, and the injected sequence 
integrates into the animal cell genome. Transformed cells are injected into blastulae, and the blastulae 
are implanted as described above. Transgenic progeny or inbred lines are studied and treated with 
potential pharmaceutical agents to obtain information on treatment of a human disease. Alternatively, 
a mammal inbred to overexpress NAAP, e.g., by secreting NAAP in its milk, may also serve as a 
convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).

THERAPEUTICS

Chemical and structural similarity, e.g., in the context of sequences and motifs, exists between 
regions of NAAP and nucleic acid-associated proteins. In addition, examples of tissues expressing 
NAAP can be found in Table 6 and can also be found in Example XI. Therefore, NAAP appears to
play a role in cell proliferative, neurological, reproductive, developmental, autoimmune/inflammatory, 
and DNA repair disorders, and infections. In the treatment of disorders associated with increased 
NAAP expression or activity, it is desirable to decrease the expression or activity of NAAP. In the 
treatment of disorders associated with decreased NAAP expression or activity, it is desirable to 
increase the expression or activity of NAAP. 

Therefore, in one embodiment, NAAP or a fragment or derivative thereof may be
administered to a subject to treat or prevent a disorder associated with decreased expression or 
activity of NAAP. Examples of such disorders include, but are not limited to, a cell proliferative 
disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed
connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia
vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, 
lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, a cancer of the adrenal 
gland, bladder, bone, bone marrow, brain, breast, cervix, gallbladder, ganglia, gastrointestinal tract, 
heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin,
spleen, testis, thymus, thyroid, and uterus; a neurological disorder such as progressive supranuclear
palsy, corticobasal degeneration, familial frontotemporal dementia, epilepsy, ischemic cerebrovascular
disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, 
dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and 
other motor neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary
ataxias, multiple sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain
abscess, subdural empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and
radiculitis, viral central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob
disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and
metabolic diseases of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal
hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation and other developmental
disorder of the central nervous system, cerebral palsy, a neuroskeletal disorder, an autonomic nervous 
system disorder, a cranial nerve disorder, a spinal cord disease, muscular dystrophy and other 
neuromuscular disorder, a peripheral nervous system disorder, dermatomyositis and polymyositis, 
inherited, metabolic, endocrine, and toxic myopathy, myasthenia gravis, periodic paralysis, a mental
disorder including mood, anxiety, and schizophrenic disorder, seasonal affective disorder (SAD),
akathisia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses,
postherpetic neuralgia, and Tourette's disorder; a reproductive disorder such as a disorder of prolactin 
production, infertility, including tubal disease, ovulatory defects, and endometriosis, a disruption of the 
estrous cycle, a disruption of the menstrual cycle, polycystic ovary syndrome, ovarian hyperstimulation
syndrome, endometrial and ovarian tumors, uterine fibroids, autoimmune disorders, ectopic
pregnancies, and teratogenesis; cancer of the breast, fibrocystic breast disease, and galactorrhea; a
disruption of spermatogenesis, abnormal sperm physiology, cancer of the testis, cancer of the prostate, 
benign prostatic hyperplasia, prostatitis, Peyronie's disease, impotence, carcinoma of the male breast, 
and gynecomastia; a developmental disorder such as renal tubular acidosis, anemia, Cushing's
syndrome, achondroplasia dwarfism, Duchenne and Becker muscular dystrophy, epilepsy, gonadal
dysgenesis, WAGR syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental
retardation), Smith-Magenis syndrome, myelodysplastic syndrome, hereditary myoepithelial dysplasia,
hereditary keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and
neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea and
cerebral palsy, spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, and 
sensorineural hearing loss; an autoimmune/inflammatory disorder such as acquired immunodeficiency 
syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing 
spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune
thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis,
cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus,
emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum,
atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's
thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis,
myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis,
Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic anaphylaxis, 
systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, 
Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral, 
bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; an infection, such as those
caused by a viral agent classified as adenovirus, arenavirus, bunyavirus, calicivirus, coronavirus, 
filovirus, hepadnavirus, herpesvirus, flavivirus, orthomyxovirus, parvovirus, papovavirus, 
paramyxovirus, picornavirus, poxvirus, reovirus, retrovirus, rhabdovirus, or togavirus; an infection 
caused by a bacterial agent classified as pneumococcus, staphylococcus, streptococcus, bacillus, 
corynebacterium, clostridium, meningococcus, gonococcus, listeria, moraxella, kingella, haemophilus,
legionella, bordetella, gram-negative enterobacterium including shigella, salmonella, or campylobacter,
pseudomonas, vibrio, brucella, francisella, yersinia, bartonella, norcardium, actinomyces,
mycobacterium, spirochaetale, rickettsia, chlamydia, or mycoplasma; an infection caused by a fungal
agent classified as aspergillus, blastomyces, dermatophytes, cryptococcus, coccidioides, malassezia,
histoplasma, or other mycosis-causing fungal agent; an infection caused by a parasite classified as
plasmodium or malaria-causing, parasitic entamoeba, leishmania, trypanosoma, toxoplasma,
Pneumocystis carinii, intestinal protozoa such as giardia, trichomonas, tissue nematode such as 
trichinella, intestinal nematode such as ascaris, lymphatic filarial nematode, trematode such as 
schistosoma, and cestode such as tapeworm; and a DNA repair disorder such as xeroderma
pigmentosum, Bloom's syndrome, and Werner's syndrome. 

In another embodiment, a vector capable of expressing NAAP or a fragment or derivative 
thereof may be administered to a subject to treat or prevent a disorder associated with decreased 
expression or activity of NAAP including, but not limited to, those described above.

In a further embodiment, a composition comprising a substantially purified NAAP in 
conjunction with a suitable pharmaceutical carrier may be administered to a subject to treat or prevent 
a disorder associated with decreased expression or activity of NAAP including, but not limited to,
those provided above. 

In still another embodiment, an agonist which modulates the activity of NAAP may be
administered to a subject to treat or prevent a disorder associated with decreased expression or
activity of NAAP including, but not limited to, those listed above.

In a further embodiment, an antagonist of NAAP may be administered to a subject to treat or
prevent a disorder associated with increased expression or activity of NAAP. Examples of such
disorders include, but are not limited to, those cell proliferative, neurological, reproductive,
developmental, autoimmune/inflammatory, and DNA repair disorders, and infections, described above.
In one aspect, an antibody which specifically binds NAAP may be used directly as an antagonist or
indirectly as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissues
which express NAAP.

In an additional embodiment, a vector expressing the complement of the polynucleotide
encoding NAAP may be administered to a subject to treat or prevent a disorder associated with
increased expression or activity of NAAP including, but not limited to, those described above.

In other embodiments, any protein, agonist, antagonist, antibody, complementary sequence, or 
vector embodiments may be administered in combination with other appropriate therapeutic agents.
Selection of the appropriate agents for use in combination therapy may be made by one of ordinary
skill in the art, according to conventional pharmaceutical principles. The combination of therapeutic
agents may act synergistically to effect the treatment or prevention of the various disorders described 
above. Using this approach, one may be able to achieve therapeutic efficacy with lower dosages of 
each agent, thus reducing the potential for adverse side effects. 

An antagonist of NAAP may be produced using methods which are generally known in the
art. In particular, purified NAAP may be used to produce antibodies or to screen libraries of
pharmaceutical agents to identify those which specifically bind NAAP. Antibodies to NAAP may
also be generated using methods that are well known in the art. Such antibodies may include, but are
not limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and
fragments produced by a Fab expression library. Neutralizing antibodies (i.e., those which inhibit 
dimer formation) are generally preferred for therapeutic use. Single chain antibodies (e.g., from 
camels or llamas) may be potent enzyme inhibitors and may have advantages in the design of peptide
mimetics, and in the development of immuno-adsorbents and biosensors (Muyldermans, S. (2001) J.
Biotechnol. 74:277-302). 

For the production of antibodies, various hosts including goats, rabbits, rats, mice, camels, 
dromedaries, llamas, humans, and others may be immunized by injection with NAAP or with any 
fragment or oligopeptide thereof which has immunogenic properties. Depending on the host species,
various adjuvants may be used to increase immunological response. Such adjuvants include, but are
not limited to, Freund's, mineral gels such as aluminum hydroxide, and surface active substances such
as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, KLH, and dinitrophenol. Among
adjuvants used in humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are
especially preferable. 

It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to
NAAP have an amino acid sequence consisting of at least about 5 amino acids, and generally will
consist of at least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or
fragments are identical to a portion of the amino acid sequence of the natural protein. Short stretches
of NAAP amino acids may be fused with those of another protein, such as KLH, and antibodies to the
chimeric molecule may be produced.

Monoclonal antibodies to NAAP may be prepared using any technique which provides for the 
production of antibody molecules by continuous cell lines in culture. These include, but are not limited 
to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma 
technique (Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J. Immunol. Methods
81:31-42; Cote, R.J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; and Cole, S.P. et al.
(1984) Mol. Cell Biol. 62:109-120).

In addition, techniques developed for the production of "chimeric antibodies," such as the
splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate
antigen specificity and biological activity, can be used (Morrison, S.L. et al. (1984) Proc. Natl. Acad.
Sci. USA 81:6851-6855; Neuberger, M.S. et al. (1984) Nature 312:604-608; and Takeda, S. et al.
(1985) Nature 314:452-454). Alternatively, techniques described for the production of single chain
antibodies may be adapted, using methods known in the art, to produce NAAP-specific single chain
antibodies. Antibodies with related specificity, but of distinct idiotypic composition, may be generated
by chain shuffling from random combinatorial immunoglobulin libraries (Burton, D.R. (1991) Proc.
Natl. Acad. Sci. USA 88:10134-10137).

Antibodies may also be produced by inducing in vivo production in the lymphocyte population
or by screening immunoglobulin libraries or panels of highly specific binding reagents as disclosed in
the literature (Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA 86:3833-3837; Winter, G. et al.
(1991) Nature 349:293-299).

Antibody fragments which contain specific binding sites for NAAP may also be generated.
For example, such fragments include, but are not limited to, F(ab')2 fragments produced by pepsin
digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of
the F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and
easy identification of monoclonal Fab fragments with the desired specificity (Huse, W.D. et al. (1989)
Science 246:1275-1281).

Various immunoassays may be used for screening to identify antibodies having the desired
specificity. Numerous protocols for competitive binding or immunoradiometric assays using either
polyclonal or monoclonal antibodies with established specificities are well known in the art. Such
immunoassays typically involve the measurement of complex formation between NAAP and its
specific antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive
to two non-interfering NAAP epitopes is generally used, but a competitive binding assay may also be
employed (Pound, supra).

Various methods such as Scatchard analysis in conjunction with radioimmunoassay techniques
may be used to assess the affinity of antibodies for NAAP. Affinity is expressed as an association
constant, Ka, which is defined as the molar concentration of NAAP-antibody complex divided by the
product of the molar concentrations of free antigen and free antibody under equilibrium conditions.
The Ka determined for a preparation of polyclonal antibodies, which are heterogeneous in their
affinities for multiple NAAP epitopes, represents the average affinity, or avidity, of the antibodies for
NAAP. The Ka determined for a preparation of monoclonal antibodies, which are monospecific for a
particular NAAP epitope, represents a true measure of affinity. High-affinity antibody preparations
with Ka ranging from about 10^9 to 10^12 L/mole are preferred for use in immunoassays in which the
NAAP-antibody complex must withstand rigorous manipulations. Low-affinity antibody preparations
with Ka ranging from about 10^6 to 10^7 L/mole are preferred for use in immunopurification and similar
procedures which ultimately require dissociation of NAAP, preferably in active form, from the
antibody (Catty, D. (1988) Antibodies, Volume I: A Practical Approach, IRL Press, Washington DC;
Liddell, J.E. and A. Cryer (1991) A Practical Guide to Monoclonal Antibodies, John Wiley & Sons,
New York NY).

The titer and avidity of polyclonal antibody preparations may be further evaluated to determine
the quality and suitability of such preparations for certain downstream applications. For example, a
polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml, preferably 5-10 mg
specific antibody/ml, is generally employed in procedures requiring precipitation of NAAP-antibody
complexes. Procedures for evaluating antibody specificity, titer, and avidity, and guidelines for
antibody quality and usage in various applications, are generally available (Catty, supra, and Coligan et
al., supra).
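
The association constant Ka defined above, and the preferred Ka ranges given for immunoassays and for immunopurification, can be illustrated with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, and the inputs are assumed to be equilibrium molar concentrations already obtained by, for example, Scatchard analysis.

```python
def association_constant(complex_molar, free_antigen_molar, free_antibody_molar):
    """Return Ka in L/mole from equilibrium molar concentrations.

    Ka = [NAAP-antibody complex] / ([free antigen] * [free antibody])
    """
    if free_antigen_molar <= 0 or free_antibody_molar <= 0:
        raise ValueError("free antigen and free antibody concentrations must be positive")
    return complex_molar / (free_antigen_molar * free_antibody_molar)


def suggested_use(ka):
    """Map a Ka value onto the preferred ranges stated in the text."""
    if 1e9 <= ka <= 1e12:
        return "immunoassay"          # complex must withstand rigorous manipulation
    if 1e6 <= ka <= 1e7:
        return "immunopurification"   # NAAP must later dissociate from the antibody
    return "unclassified"             # outside the ranges named in the text


# Example: 2 nM complex with 1 nM free antigen and 1 nM free antibody
ka = association_constant(2e-9, 1e-9, 1e-9)
print(suggested_use(ka))
```

For a polyclonal preparation the value computed this way would be an average over heterogeneous affinities (avidity) rather than a true affinity.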

In another embodiment of the invention, polynucleotides encoding NAAP, or any fragment or
complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene
expression can be achieved by designing complementary sequences or antisense molecules (DNA,
RNA, PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding
NAAP. Such technology is well known in the art, and antisense oligonucleotides or larger fragments
can be designed from various locations along the coding or control regions of sequences encoding
NAAP (Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa NJ).

In therapeutic use, any gene delivery system suitable for introduction of the antisense
sequences into appropriate target cells can be used. Antisense sequences can be delivered
intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence
complementary to at least a portion of the cellular sequence encoding the target protein (Slater, J.E. et
al. (1998) J. Allergy Clin. Immunol. 102(3):469-475; and Scanlon, K.J. et al. (1995) 9(13):1288-1296).
Antisense sequences can also be introduced intracellularly through the use of viral vectors, such as
retrovirus and adeno-associated virus vectors (Miller, A.D. (1990) Blood 76:271; Ausubel, supra;
Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63(3):323-347). Other gene delivery
mechanisms include liposome-derived systems, artificial viral envelopes, and other systems known in
the art (Rossi, J.J. (1995) Br. Med. Bull. 51(1):217-225; Boado, R.J. et al. (1998) J. Pharm. Sci.
87(11):1308-1315; and Morris, M.C. et al. (1997) Nucleic Acids Res. 25(14):2730-2736).

In another embodiment of the invention, polynucleotides encoding NAAP may be used for
somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency
(e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by
X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined
immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency
(Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475),
cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R.G. et al. (1995) Hum. Gene
Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial
hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal,
R.G. (1995) Science 270:404-410; Verma, I.M. and N. Somia (1997) Nature 389:239-242)), (ii)
express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated
cell proliferation), or (iii) express a protein which affords protection against intracellular parasites (e.g.,
against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D. (1988)
Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399), hepatitis
B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides
brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In
the case where a genetic deficiency in NAAP expression or regulation causes disease, the expression
of NAAP from an appropriate population of transduced cells may alleviate the clinical manifestations
caused by the genetic deficiency.

In a further embodiment of the invention, diseases or disorders caused by deficiencies in
NAAP are treated by constructing mammalian expression vectors encoding NAAP and introducing
these vectors by mechanical means into NAAP-deficient cells. Mechanical transfer technologies for
use with cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii) ballistic
gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and
(v) the use of DNA transposons (Morgan, R.A. and W.F. Anderson (1993) Annu. Rev. Biochem.
62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and H. Recipon (1998) Curr. Opin.
Biotechnol. 9:445-450).

Expression vectors that may be effective for the expression of NAAP include, but are not
limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX, PCR2-TOPOTA vectors
(Invitrogen, Carlsbad CA), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA),
and PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA). NAAP
may be expressed using (i) a constitutively active promoter, (e.g., from cytomegalovirus (CMV), Rous
sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible promoter
(e.g., the tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl. Acad. Sci.
USA 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and H.M. Blau
(1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid (Invitrogen));
the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND; Invitrogen); the
FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter (Rossi, F.M.V.
and H.M. Blau, supra)), or (iii) a tissue-specific promoter or the native promoter of the endogenous
gene encoding NAAP from a normal individual.

Commercially available liposome transformation kits (e.g., the PERFECT LIPID
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver
polynucleotides to target cells in culture and require minimal effort to optimize experimental
parameters. In the alternative, transformation is performed using the calcium phosphate method
(Graham, F.L. and A.J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al.
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of these
standardized mammalian transfection protocols.

In another embodiment of the invention, diseases or disorders caused by genetic defects with
respect to NAAP expression are treated by constructing a retrovirus vector consisting of (i) the
polynucleotide encoding NAAP under the control of an independent promoter or the retrovirus long
terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive
element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences
required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are
commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc.
Natl. Acad. Sci. USA 92:6733-6737), incorporated by reference herein. The vector is propagated in
an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for
receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al.
(1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and
A.D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. et
al. (1998) J. Virol. 72:9873-9880). U.S. Patent No. 5,910,434 to Rigg ("Method for obtaining
retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant") discloses
a method for obtaining retrovirus packaging cell lines and is hereby incorporated by reference.
Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4+ T-cells), and the
return of transduced cells to a patient are procedures well known to persons skilled in the art of gene
therapy and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et
al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716; Ranga, U. et al. (1998)
Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997) Blood 89:2283-2290).

In an embodiment, an adenovirus-based gene therapy delivery system is used to deliver
polynucleotides encoding NAAP to cells which have one or more genetic abnormalities with respect to
the expression of NAAP. The construction and packaging of adenovirus-based vectors are well
known to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to
be versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas
(Csete, M.E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are
described in U.S. Patent No. 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"),
hereby incorporated by reference. For adenoviral vectors, see also Antinozzi et al. (1999; Annu. Rev.
Nutr. 19:511-544) and Verma and Somia (1997; Nature 18:389:239-242), both incorporated by
reference herein.

In another embodiment, a herpes-based gene therapy delivery system is used to deliver
polynucleotides encoding NAAP to target cells which have one or more genetic abnormalities with
respect to the expression of NAAP. The use of herpes simplex virus (HSV)-based vectors may be
especially valuable for introducing NAAP to cells of the central nervous system, for which HSV has a
tropism. The construction and packaging of herpes-based vectors are well known to those with
ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has
been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res.
169:385-395). The construction of a HSV-1 virus vector has also been disclosed in detail in U.S.
Patent No. 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is hereby
incorporated by reference. U.S. Patent No. 5,804,413 teaches the use of recombinant HSV d92
which consists of a genome containing at least one exogenous gene to be transferred to a cell under
the control of the appropriate promoter for purposes including human gene therapy. Also taught by
this patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and
ICP22. For HSV vectors, see also Goins et al. (1999; J. Virol. 73:519-532) and Xu et al. (1994; Dev.
Biol. 163:152-161), hereby incorporated by reference. The manipulation of cloned herpesvirus
sequences, the generation of recombinant virus following the transfection of multiple plasmids
containing different segments of the large herpesvirus genomes, the growth and propagation of
herpesvirus, and the infection of cells with herpesvirus are techniques well known to those of ordinary
skill in the art.

In another embodiment, an alphavirus (positive, single-stranded RNA virus) vector is used to
deliver polynucleotides encoding NAAP to target cells. The biology of the prototypic alphavirus,
Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based
on the SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During
alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid
proteins. This subgenomic RNA replicates to higher levels than the full-length genomic RNA,
resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity
(e.g., protease and polymerase). Similarly, inserting the coding sequence for NAAP into the
alphavirus genome in place of the capsid-coding region results in the production of a large number of
NAAP-coding RNAs and the synthesis of high levels of NAAP in vector transduced cells. While
alphavirus infection is typically associated with cell lysis within a few days, the ability to establish a
persistent infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN)
indicates that the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy
application (Dryga, S.A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will
allow the introduction of NAAP into a variety of cell types. The specific transduction of a subset of
cells in a population may require the sorting of cells prior to transduction. The methods of
manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA
transfections, and performing alphavirus infections, are well known to those with ordinary skill in the
art.

Oligonucleotides derived from the transcription initiation site, e.g., between about positions -10 
and +10 from the start site, may also be employed to inhibit gene expression. Similarly, inhibition can 
be achieved using triple helix base-pairing methodology. Triple helix pairing is useful because it causes 
inhibition of the ability of the double helix to open sufficiently for the binding of polymerases, 
transcription factors, or regulatory molecules. Recent therapeutic advances using triplex DNA have 
been described in the literature (Gee, J.E. et al. (1994) in Huber, B.E. and B.I. Carr, Molecular and
Immunologic Approaches, Futura Publishing, Mt. Kisco NY, pp. 163-177). A complementary
sequence or antisense molecule may also be designed to block translation of mRNA by preventing the 
transcript from binding to ribosomes. 
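
As an illustration of the preceding paragraph, the Python sketch below extracts the region between about positions -10 and +10 around a transcription start site and returns the complementary (antisense) oligonucleotide. It is a minimal sketch under stated assumptions: the function names are hypothetical, the input is the sense-strand DNA written 5' to 3', `start_index` is the zero-based index of the +1 nucleotide, and there is no position 0 (the base before +1 is -1).

```python
# Complement table for unambiguous DNA bases only (an assumption of this sketch).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def start_site_window(sense_strand, start_index, upstream=10, downstream=10):
    """Return the sense-strand bases from -upstream to +downstream of the start site."""
    lo = start_index - upstream
    hi = start_index + downstream
    if lo < 0 or hi > len(sense_strand):
        raise ValueError("window extends beyond the supplied sequence")
    return sense_strand[lo:hi]


def antisense_oligo(sense_strand, start_index, upstream=10, downstream=10):
    """Return an oligonucleotide complementary to the start-site window."""
    window = start_site_window(sense_strand, start_index, upstream, downstream)
    return reverse_complement(window)
```

With the default arguments the returned oligonucleotide is 20 nucleotides long; candidate sequences would still need to be checked experimentally for inhibition of expression.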

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of 
RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme 
molecule to complementary target RNA, followed by endonucleolytic cleavage. For example, 
engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze 
endonucleolytic cleavage of RNA molecules encoding NAAP. 

Specific ribozyme cleavage sites within any potential RNA target are initially identified by 
scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA, 
GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides, 
corresponding to the region of the target gene containing the cleavage site, may be evaluated for
secondary structural features which may render the oligonucleotide inoperable. The suitability of 
candidate targets may also be evaluated by testing accessibility to hybridization with complementary 
oligonucleotides using ribonuclease protection assays. 
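
The initial scan described above can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, the candidate window is centered on the cleavage triplet as a simplifying assumption, and the secondary-structure and accessibility evaluations mentioned in the text are not modeled.

```python
import re


def to_rna(seq):
    """Normalize a DNA or RNA string to upper-case RNA (T replaced by U)."""
    return seq.upper().replace("T", "U")


def find_cleavage_sites(seq):
    """Return (position, triplet) for every GUA, GUU, or GUC in the target.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    rna = to_rna(seq)
    return [(m.start(), m.group(1)) for m in re.finditer(r"(?=(GU[AUC]))", rna)]


def candidate_targets(seq, length=17):
    """Return (position, triplet, window) tuples of 15 to 20 ribonucleotides.

    Sites too close to either end of the target to fit a full window are skipped.
    """
    if not 15 <= length <= 20:
        raise ValueError("length must be between 15 and 20 ribonucleotides")
    rna = to_rna(seq)
    flank = (length - 3) // 2
    candidates = []
    for pos, triplet in find_cleavage_sites(rna):
        start = pos - flank
        end = start + length
        if start >= 0 and end <= len(rna):
            candidates.append((pos, triplet, rna[start:end]))
    return candidates
```

Each returned window would then be evaluated for secondary structure and for accessibility, for example by ribonuclease protection assays, before a ribozyme is designed against it.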

Complementary ribonucleic acid molecules and ribozymes may be prepared by any method 
known in the art for the synthesis of nucleic acid molecules. These include techniques for chemically 
synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis. Alternatively,
RNA molecules may be generated by in vitro and in vivo transcription of DNA molecules encoding
NAAP. Such DNA sequences may be incorporated into a wide variety of vectors with suitable RNA
polymerase promoters such as T7 or SP6. Alternatively, these cDNA constructs that synthesize
complementary RNA, constitutively or inducibly, can be introduced into cell lines, cells, or tissues.

RNA molecules may be modified to increase intracellular stability and half-life. Possible
modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' ends
of the molecule, or the use of phosphorothioate or 2' O-methyl rather than phosphodiester linkages
within the backbone of the molecule. This concept is inherent in the production of PNAs and can be
extended in all of these molecules by the inclusion of nontraditional bases such as inosine, queuosine,
and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytidine,
guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases.

An additional embodiment of the invention encompasses a method for screening for a 
compound which is effective in altering expression of a polynucleotide encoding NAAP. Compounds 
which may be effective in altering expression of a specific polynucleotide may include, but are not 
limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming oligonucleotides,
transcription factors and other polypeptide transcriptional regulators, and non-macromolecular
chemical entities which are capable of interacting with specific polynucleotide sequences. Effective
compounds may alter polynucleotide expression by acting as either inhibitors or promoters of
polynucleotide expression. Thus, in the treatment of disorders associated with increased NAAP
expression or activity, a compound which specifically inhibits expression of the polynucleotide
encoding NAAP may be therapeutically useful, and in the treatment of disorders associated with
decreased NAAP expression or activity, a compound which specifically promotes expression of the 
polynucleotide encoding NAAP may be therapeutically useful. 

At least one, and up to a plurality, of test compounds may be screened for effectiveness in
altering expression of a specific polynucleotide. A test compound may be obtained by any method
commonly known in the art, including chemical modification of a compound known to be effective in
altering polynucleotide expression; selection from an existing, commercially-available or proprietary
library of naturally-occurring or non-natural chemical compounds; rational design of a compound
based on chemical and/or structural properties of the target polynucleotide; and selection from a
library of chemical compounds created combinatorially or randomly. A sample comprising a
polynucleotide encoding NAAP is exposed to at least one test compound thus obtained. The sample
may comprise, for example, an intact or permeabilized cell, or an in vitro cell-free or reconstituted
biochemical system. Alterations in the expression of a polynucleotide encoding NAAP are assayed by
any method commonly known in the art. Typically, the expression of a specific nucleotide is detected
by hybridization with a probe having a nucleotide sequence complementary to the sequence of the
polynucleotide encoding NAAP. The amount of hybridization may be quantified, thus forming the
basis for a comparison of the expression of the polynucleotide both with and without exposure to one
or more test compounds. Detection of a change in the expression of a polynucleotide exposed to a
test compound indicates that the test compound is effective in altering the expression of the
polynucleotide. A screen for a compound effective in altering expression of a specific polynucleotide
can be carried out, for example, using a Schizosaccharomyces pombe gene expression system
(Atkins, D. et al. (1999) U.S. Patent No. 5,932,435; Arndt, G.M. et al. (2000) Nucleic Acids Res.
28:E15) or a human cell line such as HeLa cell (Clarke, M.L. et al. (2000) Biochem. Biophys. Res.
Commun. 268:8-13). A particular embodiment of the present invention involves screening a
combinatorial library of oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide nucleic
acids, and modified oligonucleotides) for antisense activity against a specific polynucleotide sequence
(Bruice, T.W. et al. (1997) U.S. Patent No. 5,686,242; Bruice, T.W. et al. (2000) U.S. Patent No.
6,022,691).
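
The comparison of quantified hybridization with and without exposure to a test compound, described above, can be sketched in Python as follows. The function name and the two-fold threshold are assumptions made for illustration; the text does not specify how large a change must be to count as an alteration in expression.

```python
def classify_compound(signal_untreated, signal_treated, fold_threshold=2.0):
    """Classify a test compound from quantified hybridization signals.

    signal_untreated: signal from the sample without the test compound
    signal_treated:   signal from the sample exposed to the test compound
    fold_threshold:   hypothetical cut-off for calling a change (assumption)
    """
    if signal_untreated <= 0 or signal_treated <= 0:
        raise ValueError("hybridization signals must be positive")
    ratio = signal_treated / signal_untreated
    if ratio >= fold_threshold:
        return "promotes"
    if ratio <= 1.0 / fold_threshold:
        return "inhibits"
    return "no change"


# Example: screen several test compounds against one untreated control signal.
control = 100.0
treated = {"compound A": 250.0, "compound B": 40.0, "compound C": 120.0}
for name, signal in treated.items():
    print(name, classify_compound(control, signal))
```

In practice replicate measurements and a statistical test would be preferable to a fixed fold-change cut-off.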

Many methods for introducing vectors into cells or tissues are available and equally suitable 
for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells 
taken from the patient and clonally propagated for autologous transplant back into that same patient.
Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved
using methods which are well known in the art (Goldman, C.K. et al. (1997) Nat. Biotechnol.
15:462-466).

Any of the therapeutic methods described above may be applied to any subject in need of 
such therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits, and 
monkeys. 

An additional embodiment of the invention relates to the administration of a composition which
generally comprises an active ingredient formulated with a pharmaceutically acceptable excipient.
Excipients may include, for example, sugars, starches, celluloses, gums, and proteins. Various
formulations are commonly known and are thoroughly discussed in the latest edition of Remington's
Pharmaceutical Sciences (Mack Publishing, Easton PA). Such compositions may consist of NAAP,
antibodies to NAAP, and mimetics, agonists, antagonists, or inhibitors of NAAP.

The compositions utilized in this invention may be administered by any number of routes 
including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal,
intraventricular, pulmonary, transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical,
sublingual, or rectal means.

Compositions for pulmonary administration may be prepared in liquid or dry powder form. 
These compositions are generally aerosolized immediately prior to inhalation by the patient. In the
case of small molecules (e.g. traditional low molecular weight organic drugs), aerosol delivery of
fast-acting formulations is well-known in the art. In the case of macromolecules (e.g. larger peptides and
proteins), recent developments in the field of pulmonary delivery via the alveolar region of the lung
have enabled the practical delivery of drugs such as insulin to blood circulation (see, e.g., Patton, J.S.
et al., U.S. Patent No. 5,997,848). Pulmonary delivery has the advantage of administration without
needle injection, and obviates the need for potentially toxic penetration enhancers. 

Compositions suitable for use in the invention include compositions wherein the active
ingredients are contained in an effective amount to achieve the intended purpose. The determination
of an effective dose is well within the capability of those skilled in the art.

Specialized forms of compositions may be prepared for direct intracellular delivery of
macromolecules comprising NAAP or fragments thereof. For example, liposome preparations
containing a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of the
macromolecule. Alternatively, NAAP or a fragment thereof may be joined to a short cationic
N-terminal portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to
transduce into the cells of all tissues, including the brain, in a mouse model system (Schwarze, S.R. et 
al. (1999) Science 285:1569-1572). 

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs, monkeys,
or pigs. An animal model may also be used to determine the appropriate concentration range and
route of administration. Such information can then be used to determine useful doses and routes for
administration in humans.

A therapeutically effective dose refers to that amount of active ingredient, for example
NAAP or fragments thereof, antibodies to NAAP, and agonists, antagonists or inhibitors of NAAP,
which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined
by standard pharmaceutical procedures in cell cultures or with experimental animals, such as by
calculating the ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the dose
lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the
therapeutic index, which can be expressed as the LD50/ED50 ratio. Compositions which exhibit large
therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are
used to formulate a range of dosage for human use. The dosage contained in such compositions is
preferably within a range of circulating concentrations that includes the ED50 with little or no toxicity.
The dosage varies within this range depending upon the dosage form employed, the sensitivity of the
patient, and the route of administration.
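
The ED50, LD50, and therapeutic index statistics described above can be illustrated with a short Python sketch. The function names are hypothetical, and the linear interpolation between tested doses is a simplifying assumption of this sketch; standard pharmacological practice often fits a dose-response model on a logarithmic dose scale instead.

```python
def dose_at_half_response(doses, fractions):
    """Estimate the dose at which 50% of the population responds.

    doses:     tested doses in ascending order
    fractions: fraction of the population responding at each dose (0.0 to 1.0)
    Uses linear interpolation between adjacent tested doses (an assumption).
    """
    points = list(zip(doses, fractions))
    for (d0, f0), (d1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            return d0 + (0.5 - f0) * (d1 - d0) / (f1 - f0)
    raise ValueError("the tested doses do not bracket a 50% response")


def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, expressed as the LD50/ED50 ratio."""
    if ed50 <= 0:
        raise ValueError("ED50 must be positive")
    return ld50 / ed50


# Example with made-up numbers: efficacy and lethality curves from one study.
ed50 = dose_at_half_response([1.0, 2.0, 4.0], [0.0, 0.5, 1.0])
ld50 = dose_at_half_response([20.0, 40.0, 80.0], [0.0, 0.5, 1.0])
print(therapeutic_index(ld50, ed50))
```

A larger ratio indicates a wider margin between the therapeutic and the lethal dose, which is why compositions with large therapeutic indices are preferred.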

The exact dosage will be determined by the practitioner, in light of factors related to the 
subject requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the 
active moiety or to maintain the desired effect. Factors which may be taken into account include the
severity of the disease state, the general health of the subject, the age, weight, and gender of the
subject, time and frequency of administration, drug combination(s), reaction sensitivities, and response
to therapy. Long-acting compositions may be administered every 3 to 4 days, every week, or
biweekly depending on the half-life and clearance rate of the particular formulation. 

Normal dosage amounts may vary from about 0.1 µg to 100,000 µg, up to a total dose of
about 1 gram, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art.
Those skilled in the art will employ different formulations for nucleotides than for proteins or their
inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells, 
conditions, locations, etc. 
DIAGNOSTICS 

In another embodiment, antibodies which specifically bind NAAP may be used for the 
diagnosis of disorders characterized by expression of NAAP, or in assays to monitor patients being 
treated with NAAP or agonists, antagonists, or inhibitors of NAAP. Antibodies useful for diagnostic 
purposes may be prepared in the same manner as described above for therapeutics. Diagnostic 
assays for NAAP include methods which utilize the antibody and a label to detect NAAP in human 
body fluids or in extracts of cells or tissues. The antibodies may be used with or without modification, 
and may be labeled by covalent or non-covalent attachment of a reporter molecule. A wide variety of 
reporter molecules, several of which are described above, are known in the art and may be used. 

A variety of protocols for measuring NAAP, including ELISAs, RIAs, and FACS, are known 
in the art and provide a basis for diagnosing altered or abnormal levels of NAAP expression. Normal 
or standard values for NAAP expression are established by combining body fluids or cell extracts 
taken from normal mammalian subjects, for example, human subjects, with antibodies to NAAP under 
conditions suitable for complex formation. The amount of standard complex formation may be 
quantitated by various methods, such as photometric means. Quantities of NAAP expressed in 
subject, control, and disease samples from biopsied tissues are compared with the standard values. 
Deviation between standard and subject values establishes the parameters for diagnosing disease. 

WO 03/000864 



PCT/US02/21179 



In another embodiment of the invention, polynucleotides encoding NAAP may be used for 
diagnostic purposes. The polynucleotides which may be used include oligonucleotides, complementary 
RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect and quantify gene 
expression in biopsied tissues in which expression of NAAP may be correlated with disease. The 
diagnostic assay may be used to determine absence, presence, and excess expression of NAAP, and 
to monitor regulation of NAAP levels during therapeutic intervention. 

In one aspect, hybridization with PCR probes which are capable of detecting polynucleotides, 
including genomic sequences, encoding NAAP or closely related molecules may be used to identify 
nucleic acid sequences which encode NAAP. The specificity of the probe, whether it is made from a 
highly specific region, e.g., the 5' regulatory region, or from a less specific region, e.g., a conserved 
motif, and the stringency of the hybridization or amplification will determine whether the probe 
identifies only naturally occurring sequences encoding NAAP, allelic variants, or related sequences. 

Probes may also be used for the detection of related sequences, and may have at least 50% 
sequence identity to any of the NAAP encoding sequences. The hybridization probes of the subject 
invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO:37-72 or from 
genomic sequences including promoters, enhancers, and introns of the NAAP gene. 
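
By way of illustration only, the 50% sequence identity criterion for probes may be sketched in Python as follows. The sketch assumes that the probe and the target have already been aligned to equal length; the function names percent_identity and is_candidate_probe, the default threshold argument, and the example sequences are hypothetical and do not appear in the disclosure.

```python
def percent_identity(probe: str, target: str) -> float:
    """Percent of aligned positions at which two equal-length sequences match.

    Assumption: the sequences were aligned beforehand, so position i of the
    probe corresponds to position i of the target.
    """
    if len(probe) != len(target):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not probe:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(probe.upper(), target.upper()))
    return 100.0 * matches / len(probe)


def is_candidate_probe(probe: str, target: str, threshold: float = 50.0) -> bool:
    """True when the probe meets the identity threshold (50% by default)."""
    return percent_identity(probe, target) >= threshold


# Hypothetical sequences: 5 of the 8 aligned positions match.
identity = percent_identity("ACGTACGT", "ACGTTTTT")
```

A real comparison would first align the two sequences with an alignment tool; the sketch only covers the identity calculation performed after alignment.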

Means for producing specific hybridization probes for polynucleotides encoding NAAP include 
the cloning of polynucleotides encoding NAAP or NAAP derivatives into vectors for the production of 
mRNA probes. Such vectors are known in the art, are commercially available, and may be used to 
synthesize RNA probes in vitro by means of the addition of the appropriate RNA polymerases and 
the appropriate labeled nucleotides. Hybridization probes may be labeled by a variety of reporter 
groups, for example, by radionuclides such as 32P or 35S, or by enzymatic labels, such as alkaline 
phosphatase coupled to the probe via avidin/biotin coupling systems, and the like. 

Polynucleotides encoding NAAP may be used for the diagnosis of disorders associated with 
expression of NAAP. Examples of such disorders include, but are not limited to, a cell proliferative 
disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed 
connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia 
vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, 
lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, a cancer of the adrenal 
gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, 
heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, 
spleen, testis, thymus, thyroid, and uterus; a neurological disorder such as progressive supranuclear 
palsy, corticobasal degeneration, familial frontotemporal dementia, epilepsy, ischemic cerebrovascular 
disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, 
dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and 
other motor neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary 
ataxias, multiple sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain 
abscess, subdural empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and 
radiculitis, viral central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob 
disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and 
metabolic diseases of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal 
hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation and other developmental 
disorder of the central nervous system, cerebral palsy, a neuroskeletal disorder, an autonomic nervous 
system disorder, a cranial nerve disorder, a spinal cord disease, muscular dystrophy and other 
neuromuscular disorder, a peripheral nervous system disorder, dermatomyositis and polymyositis, 
inherited, metabolic, endocrine, and toxic myopathy, myasthenia gravis, periodic paralysis, a mental 
disorder including mood, anxiety, and schizophrenic disorder, seasonal affective disorder (SAD), 
akathisia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, 
postherpetic neuralgia, and Tourette's disorder; a reproductive disorder such as a disorder of prolactin 
production, infertility, including tubal disease, ovulatory defects, and endometriosis, a disruption of the 
estrous cycle, a disruption of the menstrual cycle, polycystic ovary syndrome, ovarian hyperstimulation 
syndrome, endometrial and ovarian tumors, uterine fibroids, autoimmune disorders, ectopic 
pregnancies, and teratogenesis; cancer of the breast, fibrocystic breast disease, and galactorrhea; a 
disruption of spermatogenesis, abnormal sperm physiology, cancer of the testis, cancer of the prostate, 
benign prostatic hyperplasia, prostatitis, Peyronie's disease, impotence, carcinoma of the male breast, 
and gynecomastia; a developmental disorder such as renal tubular acidosis, anemia, Cushing's 
syndrome, achondroplastic dwarfism, Duchenne and Becker muscular dystrophy, epilepsy, gonadal 
dysgenesis, WAGR syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental 
retardation), Smith-Magenis syndrome, myelodysplastic syndrome, hereditary mucoepithelial dysplasia, 
hereditary keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and 
neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea and 
cerebral palsy, spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, and 
sensorineural hearing loss; an autoimmune/inflammatory disorder such as acquired immunodeficiency 
syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing 
spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune 
thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, 
cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, 
emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, 
atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's 
thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, 
myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, 
Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome, systemic anaphylaxis, 
systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, 
Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral, 
bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; an infection, such as those 
caused by a viral agent classified as adenovirus, arenavirus, bunyavirus, calicivirus, coronavirus, 
filovirus, hepadnavirus, herpesvirus, flavivirus, orthomyxovirus, parvovirus, papovavirus, 
paramyxovirus, picornavirus, poxvirus, reovirus, retrovirus, rhabdovirus, or togavirus; an infection 
caused by a bacterial agent classified as pneumococcus, staphylococcus, streptococcus, bacillus, 
corynebacterium, clostridium, meningococcus, gonococcus, listeria, moraxella, kingella, haemophilus, 
legionella, bordetella, gram-negative enterobacterium including shigella, salmonella, or campylobacter, 
pseudomonas, vibrio, brucella, francisella, yersinia, bartonella, nocardia, actinomyces, 
mycobacterium, spirochaetale, rickettsia, chlamydia, or mycoplasma; an infection caused by a fungal 
agent classified as aspergillus, blastomyces, dermatophytes, cryptococcus, coccidioides, malassezia, 
histoplasma, or other mycosis-causing fungal agent; an infection caused by a parasite classified as 
plasmodium or malaria-causing, parasitic entamoeba, leishmania, trypanosoma, toxoplasma, 
Pneumocystis carinii, intestinal protozoa such as giardia, trichomonas, tissue nematode such as 
trichinella, intestinal nematode such as ascaris, lymphatic filarial nematode, trematode such as 
schistosoma, and cestode such as tapeworm; and a DNA repair disorder such as xeroderma 
pigmentosum, Bloom's syndrome, and Werner's syndrome. Polynucleotides encoding NAAP may be 
used in Southern or northern analysis, dot blot, or other membrane-based technologies; in PCR 
technologies; in dipstick, pin, and multiformat ELISA-like assays; and in microarrays utilizing fluids or 
tissues from patients to detect altered NAAP expression. Such qualitative or quantitative methods are 
well known in the art. 

In a particular aspect, polynucleotides encoding NAAP may be used in assays that detect the 
presence of associated disorders, particularly those mentioned above. Polynucleotides complementary 
to sequences encoding NAAP may be labeled by standard methods and added to a fluid or tissue 
sample from a patient under conditions suitable for the formation of hybridization complexes. After a 
suitable incubation period, the sample is washed and the signal is quantified and compared with a 
standard value. If the amount of signal in the patient sample is significantly altered in comparison to a 
control sample, then the presence of altered levels of polynucleotides encoding NAAP in the sample 
indicates the presence of the associated disorder. Such assays may also be used to evaluate the 
efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to monitor 
the treatment of an individual patient. 

In order to provide a basis for the diagnosis of a disorder associated with expression of 
NAAP, a normal or standard profile for expression is established. This may be accomplished by 
combining body fluids or cell extracts taken from normal subjects, either animal or human, with a 
sequence, or a fragment thereof, encoding NAAP, under conditions suitable for hybridization or 
amplification. Standard hybridization may be quantified by comparing the values obtained from normal 
subjects with values from an experiment in which a known amount of a substantially purified 
polynucleotide is used. Standard values obtained in this manner may be compared with values 
obtained from samples from patients who are symptomatic for a disorder. Deviation from standard 
values is used to establish the presence of a disorder. 
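
As a minimal sketch only, and not a clinical method, the comparison of a patient value with standard values may be expressed in Python as follows. The disclosure does not state how large a deviation must be; the cutoff of two standard deviations around the mean of the normal subjects (parameter k), the function names, and the example values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def standard_range(normal_values, k=2.0):
    """Return (low, high) bounds of the standard profile.

    Assumption: the standard range is the mean of the normal-subject values
    plus or minus k sample standard deviations.
    """
    center = mean(normal_values)
    spread = stdev(normal_values)
    return center - k * spread, center + k * spread


def deviates_from_standard(patient_value, normal_values, k=2.0):
    """True when the patient value falls outside the standard range."""
    low, high = standard_range(normal_values, k)
    return patient_value < low or patient_value > high


# Hypothetical hybridization signals from five normal subjects.
normal_signals = [10.0, 12.0, 11.0, 9.0, 13.0]
flagged = deviates_from_standard(20.0, normal_signals)
```

In practice the cutoff would be chosen from validation data for the particular assay rather than fixed in advance.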

Once the presence of a disorder is established and a treatment protocol is initiated, 
hybridization assays may be repeated on a regular basis to determine if the level of expression in the 
patient begins to approximate that which is observed in the normal subject. The results obtained from 
successive assays may be used to show the efficacy of treatment over a period ranging from several 
days to months. 

With respect to cancer, the presence of an abnormal amount of transcript (either under- or 
overexpressed) in biopsied tissue from an individual may indicate a predisposition for the development 
of the disease, or may provide a means for detecting the disease prior to the appearance of actual 
clinical symptoms. A more definitive diagnosis of this type may allow health professionals to employ 
preventative measures or aggressive treatment earlier, thereby preventing the development or further 
progression of the cancer. 

Additional diagnostic uses for oligonucleotides designed from the sequences encoding NAAP 
may involve the use of PCR. These oligomers may be chemically synthesized, generated 
enzymatically, or produced in vitro. Oligomers will preferably contain a fragment of a polynucleotide 
encoding NAAP, or a fragment of a polynucleotide complementary to the polynucleotide encoding 
NAAP, and will be employed under optimized conditions for identification of a specific gene or 
condition. Oligomers may also be employed under less stringent conditions for detection or 
quantification of closely related DNA or RNA sequences. 

In a particular aspect, oligonucleotide primers derived from polynucleotides encoding NAAP 
may be used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions and 
deletions that are a frequent cause of inherited or acquired genetic disease in humans. Methods of 
SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP) and 
fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers derived from polynucleotides 
encoding NAAP are used to amplify DNA using the polymerase chain reaction (PCR). The DNA 
may be derived, for example, from diseased or normal tissue, biopsy samples, bodily fluids, and the 
like. SNPs in the DNA cause differences in the secondary and tertiary structures of PCR products in 
single-stranded form, and these differences are detectable using gel electrophoresis in non-denaturing 
gels. In fSSCP, the oligonucleotide primers are fluorescently labeled, which allows detection of the 
amplimers in high-throughput equipment such as DNA sequencing machines. Additionally, sequence 
database analysis methods, termed in silico SNP (isSNP), are capable of identifying polymorphisms by 
comparing the sequence of individual overlapping DNA fragments which assemble into a common 
consensus sequence. These computer-based methods filter out sequence variations due to laboratory 
preparation of DNA and sequencing errors using statistical models and automated analyses of DNA 
sequence chromatograms. In the alternative, SNPs may be detected and characterized by mass 
spectrometry using, for example, the high throughput MASSARRAY system (Sequenom, Inc., San 
Diego CA). 
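
The in silico comparison of overlapping fragments against a common consensus may be sketched, under simplifying assumptions, as follows. The sketch assumes the fragments are already aligned and padded with "-" where a fragment does not cover a position. In place of the statistical models and chromatogram analyses mentioned above, it uses a simple count threshold (min_minor_count) as a crude filter for sequencing errors. The function name, the threshold, and the example reads are hypothetical.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_minor_count=2):
    """Build a consensus and list positions where a second base recurs.

    Returns (consensus, snps), where snps holds tuples of
    (position, consensus_base, variant_base). A variant base seen in fewer
    than min_minor_count reads is treated as a likely sequencing error.
    """
    if not aligned_reads:
        return "", []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned and padded to equal length")

    consensus = []
    snps = []
    for pos in range(length):
        # Count the bases observed at this column, ignoring padding.
        counts = Counter(read[pos] for read in aligned_reads if read[pos] != "-")
        if not counts:
            consensus.append("N")  # no read covers this position
            continue
        ranked = counts.most_common()
        major_base = ranked[0][0]
        consensus.append(major_base)
        recurring = [base for base, n in ranked[1:] if n >= min_minor_count]
        if recurring:
            snps.append((pos, major_base, recurring[0]))
    return "".join(consensus), snps


# Hypothetical aligned reads: position 2 carries a recurring G/C difference,
# while the single T at position 1 is filtered out as a probable error.
reads = ["ACGTA", "ACGTA", "ACGTA", "ACCTA", "ACCTA", "ATGTA"]
consensus, snps = candidate_snps(reads)
```

The count threshold is the simplest possible stand-in for error filtering; base-quality scores from the chromatograms would give a more reliable call.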

SNPs may be used to study the genetic basis of human disease. For example, at least 16 
common SNPs have been associated with non-insulin-dependent diabetes mellitus. SNPs are also 
useful for examining differences in disease outcomes in monogenic disorders, such as cystic fibrosis, 
sickle cell anemia, or chronic granulomatous disease. For example, variants in the mannose-binding 
lectin, MBL2, have been shown to be correlated with deleterious pulmonary outcomes in cystic 
fibrosis. SNPs also have utility in pharmacogenomics, the identification of genetic variants that 
influence a patient's response to a drug, such as life-threatening toxicity. For example, a variation in 
N-acetyl transferase is associated with a high incidence of peripheral neuropathy in response to the 
anti-tuberculosis drug isoniazid, while a variation in the core promoter of the ALOX5 gene results in 
diminished clinical response to treatment with an anti-asthma drug that targets the 5-lipoxygenase 
pathway. Analysis of the distribution of SNPs in different populations is useful for investigating 
genetic drift, mutation, recombination, and selection, as well as for tracing the origins of populations 
and their migrations (Taylor, J.G. et al. (2001) Trends Mol. Med. 7:507-512; Kwok, P.-Y. and Z. Gu 
(1999) Mol. Med. Today 5:538-543; Nowotny, P. et al. (2001) Curr. Opin. Neurobiol. 11:637-641). 

Methods which may also be used to quantify the expression of NAAP include radiolabeling or 
biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from 
standard curves (Melby, P.C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C. et al. (1993) 
Anal. Biochem. 212:229-236). The speed of quantitation of multiple samples may be accelerated by 
running the assay in a high-throughput format where the oligomer or polynucleotide of interest is 
presented in various dilutions and a spectrophotometric or colorimetric response gives rapid 
quantitation. 
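
Interpolation of a result from a standard curve may be illustrated by the following sketch. It assumes a dilution series of known amounts whose measured signal increases with amount, and it interpolates linearly between the two standards that bracket the sample signal. The function name, the units, and the example values are hypothetical; the cited references may use a different curve model.

```python
def interpolate_amount(signal, standards):
    """Estimate the amount of target from a measured signal.

    standards: list of (known_amount, measured_signal) pairs from a dilution
    series. Assumption: signal rises monotonically with amount, so linear
    interpolation between neighboring standards is meaningful.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (amount_lo, signal_lo), (amount_hi, signal_hi) in zip(points, points[1:]):
        if signal_lo <= signal <= signal_hi:
            if signal_hi == signal_lo:
                return amount_lo
            fraction = (signal - signal_lo) / (signal_hi - signal_lo)
            return amount_lo + fraction * (amount_hi - amount_lo)
    raise ValueError("signal lies outside the range of the standard curve")


# Hypothetical standard curve: (amount, absorbance) pairs.
curve = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0), (4.0, 1.5)]
estimate = interpolate_amount(0.75, curve)
```

Signals outside the range of the standards raise an error instead of being extrapolated, since extrapolation beyond the dilution series is not supported by the measured data.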

In further embodiments, oligonucleotides or longer fragments derived from any of the 
polynucleotides described herein may be used as elements on a microarray. The microarray can be 
used in transcript imaging techniques which monitor the relative expression levels of large numbers of 
genes simultaneously as described below. The microarray may also be used to identify genetic 
variants, mutations, and polymorphisms. This information may be used to determine gene function, to 
understand the genetic basis of a disorder, to diagnose a disorder, to monitor progression/regression of 
disease as a function of gene expression, and to develop and monitor the activities of therapeutic 
agents in the treatment of disease. In particular, this information may be used to develop a 
pharmacogenomic profile of a patient in order to select the most appropriate and effective treatment 
regimen for that patient. For example, therapeutic agents which are highly effective and display the 
fewest side effects may be selected for a patient based on his/her pharmacogenomic profile. 

In another embodiment, NAAP, fragments of NAAP, or antibodies specific for NAAP may 
be used as elements on a microarray. The microarray may be used to monitor or measure 
protein-protein interactions, drug-target interactions, and gene expression profiles, as described above. 

A particular embodiment relates to the use of the polynucleotides of the present invention to 
generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of 
gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by 
quantifying the number of expressed genes and their relative abundance under given conditions and at 
a given time (see Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent No. 
5,840,484, expressly incorporated by reference herein). Thus a transcript image may be generated by 
hybridizing the polynucleotides of the present invention or their complements to the totality of 
transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the 
hybridization takes place in high-throughput format, wherein the polynucleotides of the present 
invention or their complements comprise a subset of a plurality of elements on a microarray. The 
resultant transcript image would provide a profile of gene activity. 
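
As a hedged illustration, the quantification underlying a transcript image (the number of expressed genes and their relative abundance) may be sketched as follows. The sketch assumes that each microarray element yields one background-corrected signal per gene and that a signal above zero means the gene is expressed. The function name and the gene labels are hypothetical.

```python
def transcript_image(signals):
    """Summarize hybridization signals as a transcript image.

    signals: mapping of gene label to background-corrected signal.
    Returns (number_of_expressed_genes, relative_abundance), where the
    relative abundances of the expressed genes sum to 1.
    """
    expressed = {gene: value for gene, value in signals.items() if value > 0}
    total = sum(expressed.values())
    if total == 0:
        raise ValueError("no expressed transcripts detected")
    abundance = {gene: value / total for gene, value in expressed.items()}
    return len(expressed), abundance


# Hypothetical signals for three array elements.
count, abundance = transcript_image({"gene_a": 30.0, "gene_b": 10.0, "gene_c": 0.0})
```

Two such images, generated under different conditions or at different times, can then be compared gene by gene.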

Transcript images may be generated using transcripts isolated from tissues, cell lines, biopsies, 
or other biological samples. The transcript image may thus reflect gene expression in vivo, as in the 
case of a tissue or biopsy sample, or in vitro, as in the case of a cell line. 

Transcript images which profile the expression of the polynucleotides of the present invention 
may also be used in conjunction with in vitro model systems and preclinical evaluation of 
pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental 
compounds. All compounds induce characteristic gene expression patterns, frequently termed 
molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and toxicity 
(Nuwaysir, E.F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N.L. Anderson (2000) 
Toxicol. Lett. 112-113:467-471). If a test compound has a signature similar to that of a compound 
with known toxicity, it is likely to share those toxic properties. These fingerprints or signatures are 
most useful and refined when they contain expression information from a large number of genes and 
gene families. Ideally, a genome-wide measurement of expression provides the highest quality 
signature. Even genes whose expression is not altered by any tested compounds are important as 
well, as the levels of expression of these genes are used to normalize the rest of the expression data. 
The normalization procedure is useful for comparison of expression data after treatment with different 
compounds. While the assignment of gene function to elements of a toxicant signature aids in 
interpretation of toxicity mechanisms, knowledge of gene function is not necessary for the statistical 
matching of signatures which leads to prediction of toxicity. (See, for example, Press Release 00-02 
from the National Institute of Environmental Health Sciences, released February 29, 2000, available at 
http://www.niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is important and desirable in 
toxicological screening using toxicant signatures to include all expressed gene sequences. 
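
The normalization and statistical matching of toxicant signatures described above may be sketched as follows, purely as an illustration. The sketch assumes that expression is normalized to the mean signal of genes unaffected by any tested compound, that a signature is the set of log2 ratios of treated to untreated expression, and that similarity is measured by Pearson correlation. None of these choices is prescribed by the disclosure, and all names and values are hypothetical.

```python
import math


def normalize(profile, reference_genes):
    """Scale a profile so the mean signal of the reference genes equals 1."""
    scale = sum(profile[g] for g in reference_genes) / len(reference_genes)
    return {gene: value / scale for gene, value in profile.items()}


def toxicant_signature(treated, untreated, reference_genes):
    """Log2 ratio of normalized treated to untreated expression per gene."""
    t = normalize(treated, reference_genes)
    u = normalize(untreated, reference_genes)
    return {g: math.log2(t[g] / u[g]) for g in t if g not in reference_genes}


def signature_similarity(sig_a, sig_b):
    """Pearson correlation over the genes shared by two signatures."""
    genes = sorted(set(sig_a) & set(sig_b))
    if len(genes) < 2:
        raise ValueError("at least two shared genes are required")
    xs = [sig_a[g] for g in genes]
    ys = [sig_b[g] for g in genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("a signature with no variation cannot be correlated")
    return cov / (norm_x * norm_y)


# Hypothetical profiles; "ref" is a gene unaffected by either compound.
untreated = {"ref": 100.0, "g1": 50.0, "g2": 50.0, "g3": 50.0}
known_toxicant = {"ref": 100.0, "g1": 200.0, "g2": 25.0, "g3": 50.0}
# Same response as the known toxicant, measured at twice the overall intensity.
test_compound = {"ref": 200.0, "g1": 400.0, "g2": 50.0, "g3": 100.0}

known_sig = toxicant_signature(known_toxicant, untreated, ["ref"])
test_sig = toxicant_signature(test_compound, untreated, ["ref"])
similarity = signature_similarity(known_sig, test_sig)
```

The example shows why the unaltered genes matter: without normalization to "ref", the twofold difference in overall intensity would mask the fact that the two compounds induce the same pattern.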

In an embodiment, the toxicity of a test compound can be assessed by treating a biological 
sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the 
treated biological sample are hybridized with one or more probes specific to the polynucleotides of the 
present invention, so that transcript levels corresponding to the polynucleotides of the present invention 
may be quantified. The transcript levels in the treated biological sample are compared with levels in 
an untreated biological sample. Differences in the transcript levels between the two samples are 
indicative of a toxic response caused by the test compound in the treated sample. 

Another embodiment relates to the use of the polypeptides disclosed herein to analyze the 
proteome of a tissue or cell type. The term proteome refers to the global pattern of protein expression 
in a particular tissue or cell type. Each protein component of a proteome can be subjected individually 
to further analysis. Proteome expression patterns, or profiles, are analyzed by quantifying the number 
of expressed proteins and their relative abundance under given conditions and at a given time. A 
profile of a cell's proteome may thus be generated by separating and analyzing the polypeptides of a 
particular tissue or cell type. In one embodiment, the separation is achieved using two-dimensional gel 
electrophoresis, in which proteins from a sample are separated by isoelectric focusing in the first 
dimension, and then according to molecular weight by sodium dodecyl sulfate slab gel electrophoresis 
in the second dimension (Steiner and Anderson, supra). The proteins are visualized in the gel as 
discrete and uniquely positioned spots, typically by staining the gel with an agent such as Coomassie 
Blue or silver or fluorescent stains. The optical density of each protein spot is generally proportional to 
the level of the protein in the sample. The optical densities of equivalently positioned protein spots 
from different samples, for example, from biological samples either treated or untreated with a test 
compound or therapeutic agent, are compared to identify any changes in protein spot density related to 
the treatment. The proteins in the spots are partially sequenced using, for example, standard methods 
employing chemical or enzymatic cleavage followed by mass spectrometry. The identity of the protein 
in a spot may be determined by comparing its partial sequence, preferably of at least 5 contiguous 
amino acid residues, to the polypeptide sequences of interest. In some cases, further sequence data 
may be obtained for definitive protein identification. 
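
The comparison of a partial sequence with the polypeptide sequences of interest may be sketched as follows. The sketch assumes an exact match of the partial sequence, given in one-letter amino acid code, within each candidate sequence. The function name, the candidate labels, and the sequences shown are invented for illustration and are not sequences of the invention.

```python
def identify_protein(partial_sequence, candidates, min_length=5):
    """Return labels of candidates that contain the partial sequence.

    partial_sequence: contiguous residues obtained from a protein spot.
    candidates: mapping of label to full polypeptide sequence.
    min_length: shortest partial sequence accepted (5 residues, following
    the preference stated in the text).
    """
    if len(partial_sequence) < min_length:
        raise ValueError(f"at least {min_length} contiguous residues are required")
    fragment = partial_sequence.upper()
    return [label for label, seq in candidates.items() if fragment in seq.upper()]


# Hypothetical candidate sequences.
candidates = {
    "protein_1": "MKTAYIAKQRQISFVK",
    "protein_2": "MSDNELLKRTAY",
}
hits = identify_protein("AYIAK", candidates)
```

More than one returned label indicates that the partial sequence is not unique among the candidates, which corresponds to the case where further sequence data would be needed for a definitive identification.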

A proteomic profile may also be generated using antibodies specific for NAAP to quantify the 
levels of NAAP expression. In one embodiment, the antibodies are used as elements on a microarray, 
and protein expression levels are quantified by exposing the microarray to the sample and detecting 
the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem. 
270:103-111; Mendoza, L.G. et al. (1999) Biotechniques 27:778-788). Detection may be performed by a 
variety of methods known in the art, for example, by reacting the proteins in the sample with a thiol- or 
amino-reactive fluorescent compound and detecting the amount of fluorescence bound at each array 
element. 

Toxicant signatures at the proteome level are also useful for toxicological screening, and 
should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor 
correlation between transcript and protein abundances for some proteins in some tissues (Anderson, 
N.L. and J. Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be 
useful in the analysis of compounds which do not significantly affect the transcript image, but which 
alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to rapid 
degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases. 

In another embodiment, the toxicity of a test compound is assessed by treating a biological 
sample containing proteins with the test compound. Proteins that are expressed in the treated 
biological sample are separated so that the amount of each protein can be quantified. The amount of 
each protein is compared to the amount of the corresponding protein in an untreated biological sample. 
A difference in the amount of protein between the two samples is indicative of a toxic response to the 
test compound in the treated sample. Individual proteins are identified by sequencing the amino acid 
residues of the individual proteins and comparing these partial sequences to the polypeptides of the 
present invention. 

In another embodiment, the toxicity of a test compound is assessed by treating a biological 
sample containing proteins with the test compound. Proteins from the biological sample are incubated 
with antibodies specific to the polypeptides of the present invention. The amount of protein recognized 
by the antibodies is quantified. The amount of protein in the treated biological sample is compared 
with the amount in an untreated biological sample. A difference in the amount of protein between the 
two samples is indicative of a toxic response to the test compound in the treated sample. 

Microarrays may be prepared, used, and analyzed using methods known in the art (Brennan, 
T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA 
93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al. (1995) 
PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-2155; 
and Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662). Various types of microarrays are well 
known and thoroughly described in DNA Microarrays: A Practical Approach, M. Schena, ed. (1999) 
Oxford University Press, London. 

In another embodiment of the invention, nucleic acid sequences encoding NAAP may be used 
to generate hybridization probes useful in mapping the naturally occurring genomic sequence. Either 
coding or noncoding sequences may be used, and in some instances, noncoding sequences may be 
preferable over coding sequences. For example, conservation of a coding sequence among members 
of a multi-gene family may potentially cause undesired cross hybridization during chromosomal 
mapping. The sequences may be mapped to a particular chromosome, to a specific region of a 
chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs), 
yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1 
constructions, or single chromosome cDNA libraries (Harrington, J.J. et al. (1997) Nat. Genet. 
15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; and Trask, B.J. (1991) Trends Genet. 7:149-154). 
Once mapped, the nucleic acid sequences may be used to develop genetic linkage maps, for example, 
which correlate the inheritance of a disease state with the inheritance of a particular chromosome 
region or restriction fragment length polymorphism (RFLP) (Lander, E.S. and D. Botstein (1986) 
Proc. Natl. Acad. Sci. USA 83:7353-7357). 

Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic 
map data (Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968). Examples of genetic map data 
can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM) 
World Wide Web site. Correlation between the location of the gene encoding NAAP on a physical 
map and a specific disorder, or a predisposition to a specific disorder, may help define the region of 
DNA associated with that disorder and thus may further positional cloning efforts. 

In situ hybridization of chromosomal preparations and physical mapping techniques, such as 
linkage analysis using established chromosomal markers, may be used for extending genetic maps. 
Often the placement of a gene on the chromosome of another mammalian species, such as mouse, 
may reveal associated markers even if the exact chromosomal locus is not known. This information is 
valuable to investigators searching for disease genes using positional cloning or other gene discovery 
techniques. Once the gene or genes responsible for a disease or syndrome have been crudely 
localized by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23, any 
sequences mapping to that area may represent associated or regulatory genes for further investigation 
(Gatti, R.A. et al. (1988) Nature 336:577-580). The nucleotide sequence of the instant invention may 
also be used to detect differences in the chromosomal location due to translocation, inversion, etc., 
among normal, carrier, or affected individuals. 

In another embodiment of the invention, NAAP, its catalytic or immunogenic fragments, or 
oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug 
screening techniques. The fragment employed in such screening may be free in solution, affixed to a 
solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes 
between NAAP and the agent being tested may be measured. 

Another technique for drug screening provides for high throughput screening of compounds 
having suitable binding affinity to the protein of interest (Geysen, et al. (1984) PCT application 
WO84/03564). In this method, large numbers of different small test compounds are synthesized on a 
solid substrate. The test compounds are reacted with NAAP, or fragments thereof, and washed. 
Bound NAAP is then detected by methods well known in the art. Purified NAAP can also be coated 
directly onto plates for use in the aforementioned drug screening techniques. Alternatively, 
non-neutralizing antibodies can be used to capture the peptide and immobilize it on a solid support. 

In another embodiment, one may use competitive drug screening assays in which neutralizing 
antibodies capable of binding NAAP specifically compete with a test compound for binding NAAP. 
In this manner, antibodies can be used to detect the presence of any peptide which shares one or more 
antigenic determinants with NAAP. 

In additional embodiments, the nucleotide sequences which encode NAAP may be used in 
any molecular biology techniques that have yet to be developed, provided the new techniques rely on 
properties of nucleotide sequences that are currently known, including, but not limited to, such 
properties as the triplet genetic code and specific base pair interactions. 

Without further elaboration, it is believed that one skilled in the art can, using the preceding 
description, utilize the present invention to its fullest extent. The following embodiments are, therefore, 
to be construed as merely illustrative, and not limitative of the remainder of the disclosure in any way 
whatsoever. 

The disclosures of all patents, applications and publications, mentioned above and below, 
including U.S. Ser. No. 60/301,893, U.S. Ser. No. 60/300,518, U.S. Ser. No. 60/301,787, U.S. Ser. 
No. 60/301,892, U.S. Ser. No. 60/301,792, U.S. Ser. No. 60/303,442, U.S. Ser. No. 60/303,405, and 
U.S. Ser. No. 60/364,438, are expressly incorporated by reference herein. 

EXAMPLES 

I. Construction of cDNA Libraries 

Incyte cDNAs were derived from cDNA libraries described in the LIFESEQ GOLD 
database (Incyte Genomics, Palo Alto CA). Some tissues were homogenized and lysed in guanidinium 
isothiocyanate, while others were homogenized and lysed in phenol or in a suitable mixture of 
denaturants, such as TRIZOL (Invitrogen), a monophasic solution of phenol and guanidine 
isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or extracted with 
chloroform. RNA was precipitated from the lysates with either isopropanol or sodium acetate and 
ethanol, or by other routine methods. 

Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA 
purity. In some cases, RNA was treated with DNase. For most libraries, poly(A)+ RNA was 
isolated using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles 
(QIAGEN, Chatsworth CA), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, 
RNA was isolated directly from tissue lysates using other RNA isolation kits, e.g., the 
POLY(A)PURE mRNA purification kit (Ambion, Austin TX). 

In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA 
libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the 
UNIZAP vector system (Stratagene) or SUPERSCRIPT plasmid system (Invitrogen), using the 
recommended procedures or similar methods known in the art (Ausubel, 1997, supra, units 5.1-6.6). 
Reverse transcription was initiated using oligo d(T) or random primers. Synthetic oligonucleotide 
adapters were ligated to double stranded cDNA, and the cDNA was digested with the appropriate 
restriction enzyme or enzymes. For most libraries, the cDNA was size-selected (300-1000 bp) using 
SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography 
(Amersham Biosciences) or preparative agarose gel electrophoresis. cDNAs were ligated into 
compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g., PBLUESCRIPT 
plasmid (Stratagene), PSPORT1 plasmid (Invitrogen), PCDNA2.1 plasmid (Invitrogen, Carlsbad CA), 
PBK-CMV plasmid (Stratagene), PCR2-TOPOTA plasmid (Invitrogen), PCMV-ICIS plasmid 
(Stratagene), pIGEN (Incyte Genomics, Palo Alto CA), pRARE (Incyte Genomics), or pINCY 
(Incyte Genomics), or derivatives thereof. Recombinant plasmids were transformed into competent 
E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α, DH10B, or 
ElectroMAX DH10B from Invitrogen. 

II. Isolation of cDNA Clones 

Plasmids obtained as described in Example I were recovered from host cells by in vivo 
excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using 
at least one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an 
AGTC Miniprep purification kit (Edge Biosystems, Gaithersburg MD); and QIAWELL 8 Plasmid, 
QIAWELL 8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP 
96 plasmid purification kit from QIAGEN. Following precipitation, plasmids were resuspended in 0.1 
ml of distilled water and stored, with or without lyophilization, at 4 °C. 

Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a 
high-throughput format (Rao, V.B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal 
cycling steps were carried out in a single reaction mixture. Samples were processed and stored in 
384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically using 
PICOGREEN dye (Molecular Probes, Eugene OR) and a FLUOROSKAN II fluorescence scanner 
(Labsystems Oy, Helsinki, Finland). 

III. Sequencing and Analysis 

Incyte cDNAs recovered in plasmids as described in Example II were sequenced as follows. 
Sequencing reactions were processed using standard methods or high-throughput instrumentation such 
as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal cycler 
(MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the 
MICROLAB 2200 (Hamilton) liquid transfer system. cDNA sequencing reactions were prepared 
using reagents provided by Amersham Biosciences or supplied in ABI sequencing kits such as the 
ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems). 
Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides 
were carried out using the MEGABACE 1000 DNA sequencing system (Amersham Biosciences); 
the ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard 
ABI protocols and base calling software; or other sequence analysis systems known in the art. 
Reading frames within the cDNA sequences were identified using standard methods (reviewed in 
Ausubel, 1997, supra, unit 7.7). Some of the cDNA sequences were selected for extension using the 
techniques disclosed in Example VIII. 

The polynucleotide sequences derived from Incyte cDNAs were validated by removing 
vector, linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and 
programs based on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The 
Incyte cDNA sequences or translations thereof were then queried against a selection of public 
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and 
BLOCKS, PRINTS, DOMO, PRODOM; PROTEOME databases with sequences from Homo 
sapiens, Rattus norvegicus, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae, 
Schizosaccharomyces pombe, and Candida albicans (Incyte Genomics, Palo Alto CA); hidden 
Markov model (HMM)-based protein family databases such as PFAM, INCY, and TIGRFAM (Haft, 
D.H. et al. (2001) Nucleic Acids Res. 29:41-43); and HMM-based protein domain databases such as 
SMART (Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95:5857-5864; Letunic, I. et al. (2002) 
Nucleic Acids Res. 30:242-244). (HMM is a probabilistic approach which analyzes consensus 
primary structures of gene families. See, for example, Eddy, S.R. (1996) Curr. Opin. Struct. Biol. 
6:361-365.) The queries were performed using programs based on BLAST, FASTA, BLIMPS, and 
HMMER. The Incyte cDNA sequences were assembled to produce full length polynucleotide 
sequences. Alternatively, GenBank cDNAs, GenBank ESTs, stitched sequences, stretched 
sequences, or Genscan-predicted coding sequences (see Examples IV and V) were used to extend 
Incyte cDNA assemblages to full length. Assembly was performed using programs based on Phred, 
Phrap, and Consed, and cDNA assemblages were screened for open reading frames using programs 
based on GeneMark, BLAST, and FASTA. The full length polynucleotide sequences were translated 
to derive the corresponding full length polypeptide sequences. Alternatively, a polypeptide may begin 
at any of the methionine residues of the full length translated polypeptide. Full length polypeptide 
sequences were subsequently analyzed by querying against databases such as the GenBank protein 
databases (genpept), SwissProt, the PROTEOME databases, BLOCKS, PRINTS, DOMO, 
PRODOM, Prosite, hidden Markov model (HMM)-based protein family databases such as PFAM, 
INCY, and TIGRFAM; and HMM-based protein domain databases such as SMART. Full length 
polynucleotide sequences are also analyzed using MACDNASIS PRO software (Hitachi Software 
Engineering, South San Francisco CA) and LASERGENE software (DNASTAR). Polynucleotide 
and polypeptide sequence alignments are generated using default parameters specified by the 
CLUSTAL algorithm as incorporated into the MEGALIGN multisequence alignment program 
(DNASTAR), which also calculates the percent identity between aligned sequences. 
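
By way of illustration only, the percent identity between two aligned sequences may be sketched in Python as follows. The function name and the convention of skipping alignment columns in which either sequence carries a gap are assumptions made for this sketch; the MEGALIGN program may define percent identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns in which neither sequence has a gap.

    Assumption for illustration: gapped columns are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))
```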

Table 7 summarizes the tools, programs, and algorithms used for the analysis and assembly of 
Incyte cDNA and full length sequences and provides applicable descriptions, references, and threshold 
parameters. The first column of Table 7 shows the tools, programs, and algorithms used, the second 
column provides brief descriptions thereof, the third column presents appropriate references, all of 
which are incorporated by reference herein in their entirety, and the fourth column presents, where 
applicable, the scores, probability values, and other parameters used to evaluate the strength of a 
match between two sequences (the higher the score or the lower the probability value, the greater the 
identity between two sequences). 

The programs described above for the assembly and analysis of full length polynucleotide and 
polypeptide sequences were also used to identify polynucleotide sequence fragments from SEQ ID 
NO:37-72. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization and 
amplification technologies are described in Table 4, column 2. 

IV. Identification and Editing of Coding Sequences from Genomic DNA 

Putative nucleic acid-associated proteins were initially identified by running the Genscan gene 
identification program against public genomic sequence databases (e.g., gbpri and gbhtg). Genscan is 
a general-purpose gene identification program which analyzes genomic DNA sequences from a 
variety of organisms (Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94; Burge, C. and S. Karlin 
(1998) Curr. Opin. Struct. Biol. 8:346-354). The program concatenates predicted exons to form an 
assembled cDNA sequence extending from a methionine to a stop codon. The output of Genscan is a 
FASTA database of polynucleotide and polypeptide sequences. The maximum range of sequence for 
Genscan to analyze at once was set to 30 Kb. To determine which of these Genscan predicted cDNA 
sequences encode nucleic acid-associated proteins, the encoded polypeptides were analyzed by 
querying against PFAM models for nucleic acid-associated proteins. Potential nucleic acid-associated 
proteins were also identified by homology to Incyte cDNA sequences that had been annotated as 
nucleic acid-associated proteins. These selected Genscan-predicted sequences were then compared 
by BLAST analysis to the genpept and gbpri public databases. Where necessary, the Genscan- 
predicted sequences were then edited by comparison to the top BLAST hit from genpept to correct 
errors in the sequence predicted by Genscan, such as extra or omitted exons. BLAST analysis was 
also used to find any Incyte cDNA or public cDNA coverage of the Genscan-predicted sequences, 
thus providing evidence for transcription. When Incyte cDNA coverage was available, this 
information was used to correct or confirm the Genscan predicted sequence. Full length 
polynucleotide sequences were obtained by assembling Genscan-predicted coding sequences with 
Incyte cDNA sequences and/or public cDNA sequences using the assembly process described in 
Example III. Alternatively, full length polynucleotide sequences were derived entirely from edited or 
unedited Genscan-predicted coding sequences. 

V. Assembly of Genomic Sequence Data with cDNA Sequence Data 
"Stitched" Sequences 

Partial cDNA sequences were extended with exons predicted by the Genscan gene 
identification program described in Example IV. Partial cDNAs assembled as described in Example 
III were mapped to genomic DNA and parsed into clusters containing related cDNAs and Genscan 
exon predictions from one or more genomic sequences. Each cluster was analyzed using an algorithm 
based on graph theory and dynamic programming to integrate cDNA and genomic information, 
generating possible splice variants that were subsequently confirmed, edited, or extended to create a 
full length sequence. Sequence intervals in which the entire length of the interval was present on 
more than one sequence in the cluster were identified, and intervals thus identified were considered to 
be equivalent by transitivity. For example, if an interval was present on a cDNA and two genomic 
sequences, then all three intervals were considered to be equivalent. This process allows unrelated 
but consecutive genomic sequences to be brought together, bridged by cDNA sequence. Intervals 
thus identified were then "stitched" together by the stitching algorithm in the order that they appear 
along their parent sequences to generate the longest possible sequence, as well as sequence variants. 
Linkages between intervals which proceed along one type of parent sequence (cDNA to cDNA or 
genomic sequence to genomic sequence) were given preference over linkages which change parent 
type (cDNA to genomic sequence). The resultant stitched sequences were translated and compared 
by BLAST analysis to the genpept and gbpri public databases. Incorrect exons predicted by Genscan 
were corrected by comparison to the top BLAST hit from genpept. Sequences were further extended 
with additional cDNA sequences, or by inspection of genomic DNA, when necessary. 
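
By way of illustration only, the grouping of intervals that are "equivalent by transitivity" may be sketched in Python with a union-find structure. The stitching algorithm itself is not disclosed in detail here; the function name, the string labels used for intervals, and the choice of union-find are assumptions made for this sketch.

```python
from collections import defaultdict


def group_equivalent(pairs):
    """Group interval labels into equivalence classes from pairwise matches.

    `pairs` is an iterable of (interval_a, interval_b) tuples, each stating that
    the two intervals cover the same sequence. Labels are hypothetical strings.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two classes

    groups = defaultdict(set)
    for label in list(parent):
        groups[find(label)].add(label)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    # An interval present on one cDNA and two genomic sequences:
    matches = [
        ("cDNA1:100-250", "genomicA:5000-5150"),
        ("cDNA1:100-250", "genomicB:20-170"),
    ]
    print(group_equivalent(matches))
```

In this example the two genomic intervals are never compared directly, yet all three intervals fall into one class because each matches the cDNA interval.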
"Stretched" Sequences 

Partial DNA sequences were extended to full length with an algorithm based on BLAST 
analysis. First, partial cDNAs assembled as described in Example III were queried against public 
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases 
using the BLAST program. The nearest GenBank protein homolog was then compared by BLAST 
analysis to either Incyte cDNA sequences or GenScan exon predicted sequences described in 
Example IV. A chimeric protein was generated by using the resultant high-scoring segment pairs 
(HSPs) to map the translated sequences onto the GenBank protein homolog. Insertions or deletions 
may occur in the chimeric protein with respect to the original GenBank protein homolog. The 
GenBank protein homolog, the chimeric protein, or both were used as probes to search for homologous 
genomic sequences from the public human genome databases. Partial DNA sequences were 
therefore "stretched" or extended by the addition of homologous genomic sequences. The resultant 
stretched sequences were examined to determine whether they contained a complete gene. 

VI. Chromosomal Mapping of NAAP Encoding Polynucleotides 

The sequences which were used to assemble SEQ ID NO:37-72 were compared with 
sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other 
implementations of the Smith-Waterman algorithm. Sequences from these databases that matched 
SEQ ID NO:37-72 were assembled into clusters of contiguous and overlapping sequences using 
assembly algorithms such as Phrap (Table 7). Radiation hybrid and genetic mapping data available 
from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for 
Genome Research (WIGR), and Genethon were used to determine if any of the clustered sequences 
had been previously mapped. Inclusion of a mapped sequence in a cluster resulted in the assignment 
of all sequences of that cluster, including its particular SEQ ID NO:, to that map location. 
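
By way of illustration only, the assignment of a map location to every member of a cluster may be sketched in Python as follows. The function name, the dictionary layout, and the identifiers are hypothetical. The handling of a cluster whose members carry conflicting map locations is not described above; this sketch simply leaves such a cluster unassigned, which is an assumption.

```python
def assign_map_locations(clusters, mapped):
    """Propagate a known map location to all sequences in the same cluster.

    clusters: dict of cluster id -> list of sequence ids (hypothetical layout)
    mapped:   dict of sequence id -> previously determined map location
    Returns a dict of sequence id -> map location.
    """
    assigned = {}
    for members in clusters.values():
        locations = {mapped[s] for s in members if s in mapped}
        if len(locations) != 1:
            continue  # no mapped member, or conflicting locations (assumption: skip)
        location = next(iter(locations))
        for s in members:
            assigned[s] = location
    return assigned


if __name__ == "__main__":
    clusters = {"c1": ["SEQ37", "est1", "sts1"], "c2": ["SEQ38", "est2"]}
    mapped = {"sts1": "chr11: 100.2-105.7 cM"}
    print(assign_map_locations(clusters, mapped))
```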

Map locations are represented by ranges, or intervals, of human chromosomes. The map 
position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's 
p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between 
chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in 
humans, although this can vary widely due to hot and cold spots of recombination.) The cM distances 
are based on genetic markers mapped by Genethon which provide boundaries for radiation hybrid 
markers whose sequences were included in each of the clusters. Human genome maps and other 
resources available to the public, such as the NCBI "GeneMap'99" World Wide Web site 
(http://www.ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified disease 
genes map within or in proximity to the intervals indicated above. 

VII. Analysis of Polynucleotide Expression 

Northern analysis is a laboratory technique used to detect the presence of a transcript of a 
gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs 
from a particular cell type or tissue have been bound (Sambrook et al., supra, ch. 7; Ausubel (1995) 
supra, ch. 4 and 16). 

Analogous computer techniques applying BLAST were used to search for identical or related 
molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is 
much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the computer 
search can be modified to determine whether any particular match is categorized as exact or similar. 
The basis of the search is the product score, which is defined as: 

Product score = (BLAST Score x Percent Identity) / (5 x minimum {length(Seq. 1), length(Seq. 2)}) 

The product score takes into account both the degree of similarity between two sequences and the 
length of the sequence match. The product score is a normalized value between 0 and 100, and is 
calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the 
product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is 
calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair 
(HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by 
gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate 
the product score. The product score represents a balance between fractional overlap and quality in a 
BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the 
entire length of the shorter of the two sequences being compared. A product score of 70 is produced 
either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the 
other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79% 
identity and 100% overlap. 
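
By way of illustration only, the product score calculation described above may be sketched in Python as follows. The function names and the representation of an HSP as counts of matching and mismatching bases are assumptions made for this sketch and are not taken from the BLAST software. Under these assumptions, 88% identity over the full length of the shorter sequence gives a score of about 69, and 79% identity gives about 49, in line with the approximate values of 70 and 50 given above.

```python
def blast_score(matches: int, mismatches: int) -> int:
    """Score an HSP with +5 per matching base and -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(matches: int, mismatches: int, len_seq1: int, len_seq2: int) -> float:
    """Product score = BLAST score x percent identity / (5 x shorter length).

    `matches` and `mismatches` describe the highest-scoring HSP (an assumed,
    simplified representation of that HSP).
    """
    hsp_length = matches + mismatches
    if hsp_length == 0:
        return 0.0
    percent_identity = 100.0 * matches / hsp_length
    shorter = min(len_seq1, len_seq2)
    return blast_score(matches, mismatches) * percent_identity / (5 * shorter)


if __name__ == "__main__":
    print(product_score(100, 0, 100, 200))  # full identity over the shorter sequence
    print(product_score(70, 0, 100, 100))   # 100% identity, 70% overlap
    print(product_score(88, 12, 100, 100))  # 88% identity, 100% overlap
    print(product_score(79, 21, 100, 100))  # 79% identity, 100% overlap
```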

Alternatively, polynucleotides encoding NAAP are analyzed with respect to the tissue sources 
from which they were derived. For example, some full length sequences are assembled, at least in 
part, with overlapping Incyte cDNA sequences (see Example III). Each cDNA sequence is derived 
from a cDNA library constructed from a human tissue. Each human tissue is classified into one of the 
following organ/tissue categories: cardiovascular system; connective tissue; digestive system; 
embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; 
hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory 
system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract. The number 
of libraries in each category is counted and divided by the total number of libraries across all 
categories. Similarly, each human tissue is classified into one of the following disease/condition 
categories: cancer, cell line, developmental, inflammation, neurological, trauma, cardiovascular, pooled, 
and other, and the number of libraries in each category is counted and divided by the total number of 
libraries across all categories. The resulting percentages reflect the tissue- and disease-specific 
expression of cDNA encoding NAAP. 
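
By way of illustration only, the counting of libraries per category and the conversion to percentages may be sketched in Python as follows. The function name and the input format (one category label per library) are assumptions made for this sketch.

```python
from collections import Counter


def category_percentages(library_categories):
    """Return the percentage of libraries falling in each category.

    `library_categories` holds one category label per cDNA library in which
    the sequence was found (hypothetical input format).
    """
    counts = Counter(library_categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


if __name__ == "__main__":
    libraries = ["nervous system", "nervous system", "liver", "skin"]
    print(category_percentages(libraries))
```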

VIII. Extension of NAAP Encoding Polynucleotides 

Full length polynucleotides are produced by extension of an appropriate fragment of the full 
length molecule using oligonucleotide primers designed from this fragment. One primer was 
synthesized to initiate 5' extension of the known fragment, and the other primer was synthesized to 
initiate 3' extension of the known fragment. The initial primers were designed using OLIGO 4.06 
software (National Biosciences), or another appropriate program, to be about 22 to 30 nucleotides in 
length, to have a GC content of about 50% or more, and to anneal to the target sequence at 
temperatures of about 68 °C to about 72 °C. Any stretch of nucleotides which would result in hairpin 
structures and primer-primer dimerizations was avoided. 
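
By way of illustration only, the length and GC content criteria stated above may be checked with a short Python sketch. The function names and default thresholds are assumptions drawn from the approximate figures in the text; the sketch does not estimate annealing temperature or detect hairpins or primer dimers, and it is not a description of the OLIGO 4.06 software.

```python
def gc_content(primer: str) -> float:
    """Percentage of G and C bases in a primer sequence."""
    p = primer.upper()
    return 100.0 * (p.count("G") + p.count("C")) / len(p)


def meets_basic_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                         min_gc: float = 50.0) -> bool:
    """Check primer length (about 22 to 30 nt) and GC content (about 50% or more)."""
    if not primer:
        return False
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc


if __name__ == "__main__":
    candidate = "GCGCATGCATGCCGTAGCTAGCGC"  # hypothetical 24-mer
    print(len(candidate), gc_content(candidate), meets_basic_criteria(candidate))
```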

Selected human cDNA libraries were used to extend the sequence. If more than one 
extension was necessary or desired, additional or nested sets of primers were designed. 

High fidelity amplification was obtained by PCR using methods well known in the art. PCR 
was performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction 
mix contained DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4, 
and 2-mercaptoethanol, Taq DNA polymerase (Amersham Biosciences), ELONGASE enzyme 
(Invitrogen), and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair 
PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C, 2 
min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C. In the 
alternative, the parameters for primer pair T7 and SK+ were as follows: Step 1: 94°C, 3 min; Step 2: 
94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; 
Step 6: 68°C, 5 min; Step 7: storage at 4°C. 

The concentration of DNA in each well was determined by dispensing 100 µl PICOGREEN 
quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene OR) dissolved in 1X TE 
and 0.5 µl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar, 
Acton MA), allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II 
(Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the 
concentration of DNA. A 5 µl to 10 µl aliquot of the reaction mixture was analyzed by 
electrophoresis on a 1% agarose gel to determine which reactions were successful in extending the 
sequence. 

The extended nucleotides were desalted and concentrated, transferred to 384-well plates, 
digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and 
sonicated or sheared prior to religation into pUC 18 vector (Amersham Biosciences). For shotgun 
sequencing, the digested nucleotides were separated on low concentration (0.6 to 0.8%) agarose gels, 
fragments were excised, and agar digested with Agar ACE (Promega). Extended clones were 
religated using T4 ligase (New England Biolabs, Beverly MA) into pUC 18 vector (Amersham 
Biosciences), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site overhangs, and 
transfected into competent E. coli cells. Transformed cells were selected on antibiotic-containing 
media, and individual colonies were picked and cultured overnight at 37 °C in 384-well plates in LB/2x 
carb liquid media. 

The cells were lysed, and DNA was amplified by PCR using Taq DNA polymerase 
(Amersham Biosciences) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 
1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step 5: steps 2, 3, and 
4 repeated 29 times; Step 6: 72 °C, 5 min; Step 7: storage at 4°C. DNA was quantified by 
PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA recoveries 
were reamplified using the same conditions as described above. Samples were diluted with 20% 
dimethylsulfoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing primers 
and the DYENAMIC DIRECT kit (Amersham Biosciences) or the ABI PRISM BIGDYE 
Terminator cycle sequencing ready reaction kit (Applied Biosystems). 

In like manner, full length polynucleotides are verified using the above procedure or are used 
to obtain 5' regulatory sequences using the above procedure along with oligonucleotides designed for 
such extension, and an appropriate genomic library. 

IX. Identification of Single Nucleotide Polymorphisms in NAAP Encoding 
Polynucleotides 

Common DNA sequence variants known as single nucleotide polymorphisms (SNPs) were 
identified in SEQ ID NO:37-72 using the LIFESEQ database (Incyte Genomics). Sequences from the 
same gene were clustered together and assembled as described in Example III, allowing the 
identification of all sequence variants in the gene. An algorithm consisting of a series of filters was 
used to distinguish SNPs from other sequence variants. Preliminary filters removed the majority of 
basecall errors by requiring a minimum Phred quality score of 15, and removed sequence alignment 
errors and errors resulting from improper trimming of vector sequences, chimeras, and splice variants. 
An automated procedure of advanced chromosome analysis analysed the original chromatogram files 
in the vicinity of the putative SNP. Clone error filters used statistically generated algorithms to identify 
errors introduced during laboratory processing, such as those caused by reverse transcriptase, 
polymerase, or somatic mutation. Clustering error filters used statistically generated algorithms to 
identify errors resulting from clustering of close homologs or pseudogenes, or due to contamination by 
non-human sequences. A final set of filters removed duplicates and SNPs found in immunoglobulins 
or T-cell receptors. 
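
By way of illustration only, the preliminary base-call quality filter may be sketched in Python as follows. A Phred quality score Q corresponds to an estimated base-call error probability of 10^(-Q/10), so the minimum score of 15 corresponds to an error probability of roughly 3%. The function names and the dictionary fields "position" and "quality" are hypothetical, and the remaining filters described above are not reproduced.

```python
def phred_error_probability(quality: float) -> float:
    """Estimated base-call error probability for a Phred quality score."""
    return 10 ** (-quality / 10)


def passes_quality_filter(candidates, min_quality: int = 15):
    """Keep candidate variants whose base call meets the minimum Phred score.

    Each candidate is a dict with hypothetical keys "position" and "quality".
    """
    return [c for c in candidates if c["quality"] >= min_quality]


if __name__ == "__main__":
    candidates = [
        {"position": 120, "quality": 32},
        {"position": 457, "quality": 9},
        {"position": 803, "quality": 15},
    ]
    print(passes_quality_filter(candidates))
    print(phred_error_probability(15))
```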

Certain SNPs were selected for further characterization by mass spectrometry using the high 
throughput MASSARRAY system (Sequenom, Inc.) to analyze allele frequencies at the SNP sites in 
four different human populations. The Caucasian population comprised 92 individuals (46 male, 46 
female), including 83 from Utah, four French, three Venezuelan, and two Amish individuals. The 
African population comprised 194 individuals (97 male, 97 female), all African Americans. The 
Hispanic population comprised 324 individuals (162 male, 162 female), all Mexican Hispanic. The 
Asian population comprised 126 individuals (64 male, 62 female) with a reported parental breakdown 
of 43% Chinese, 31% Japanese, 13% Korean, 5% Vietnamese, and 8% other Asian. Allele 
frequencies were first analyzed in the Caucasian population; in some cases those SNPs which showed 
no allelic variance in this population were not further tested in the other three populations. 
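
By way of illustration only, allele frequencies at a biallelic SNP site may be computed from genotype counts as sketched below in Python. The sketch assumes that individual diploid genotypes are available for each population; the text above does not state how the MASSARRAY measurements were converted to frequencies, and the function name and example counts are hypothetical.

```python
def allele_frequencies(n_aa: int, n_ab: int, n_bb: int):
    """Return (frequency of allele A, frequency of allele B) for diploid genotypes.

    n_aa, n_ab, n_bb are the numbers of individuals with each genotype.
    """
    individuals = n_aa + n_ab + n_bb
    if individuals == 0:
        raise ValueError("no genotyped individuals")
    chromosomes = 2 * individuals  # two alleles per individual
    freq_a = (2 * n_aa + n_ab) / chromosomes
    freq_b = (2 * n_bb + n_ab) / chromosomes
    return freq_a, freq_b


if __name__ == "__main__":
    # Hypothetical genotype counts for a population of 92 individuals
    print(allele_frequencies(46, 40, 6))
```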

X. Labeling and Use of Individual Hybridization Probes 

Hybridization probes derived from SEQ ID NO:37-72 are employed to screen cDNAs, 
genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base 
pairs, is specifically described, essentially the same procedure is used with larger nucleotide 
fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06 
software (National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 µCi of 
[γ-32P] adenosine triphosphate (Amersham Biosciences), and T4 polynucleotide kinase (DuPont NEN, 
Boston MA). The labeled oligonucleotides are substantially purified using a SEPHADEX G-25 
superfine size exclusion dextran bead column (Amersham Biosciences). An aliquot containing 10^7 
counts per minute of the labeled probe is used in a typical membrane-based hybridization analysis of 
human genomic DNA digested with one of the following endonucleases: Ase I, Bgl II, Eco RI, Pst I, 
Xba I, or Pvu II (DuPont NEN). 

The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to nylon 
membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is carried out for 16 
hours at 40°C. To remove nonspecific signals, blots are sequentially washed at room temperature 
under conditions of up to, for example, 0.1 x saline sodium citrate and 0.5% sodium dodecyl sulfate. 
Hybridization patterns are visualized using autoradiography or an alternative imaging means and 
compared. 

XI. Microarrays 

The linkage or synthesis of array elements upon a microarray can be achieved utilizing 
photolithography, piezoelectric printing (ink-jet printing; see, e.g., Baldeschweiler, supra), mechanical 
microspotting technologies, and derivatives thereof. The substrate in each of the aforementioned 
technologies should be uniform and solid with a non-porous surface (Schena (1999), supra). 
Suggested substrates include silicon, silica, glass slides, glass chips, and silicon wafers. Alternatively, a 
procedure analogous to a dot or slot blot may also be used to arrange and link elements to the surface 
of a substrate using thermal, UV, chemical, or mechanical bonding procedures. A typical array may 
be produced using available methods and machines well known to those of ordinary skill in the art and 
may contain any appropriate number of elements (Schena, M. et al. (1995) Science 270:467-470; 
Shalon, D. et al. (1996) Genome Res. 6:639-645; Marshall, A. and J. Hodgson (1998) Nat. Biotechnol. 
16:27-31). 

Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof may 
comprise the elements of the microarray. Fragments or oligomers suitable for hybridization can be 
selected using software well known in the art such as LASERGENE software (DNASTAR). The 
array elements are hybridized with polynucleotides in a biological sample. The polynucleotides in the 
biological sample are conjugated to a fluorescent label or other molecular tag for ease of detection. 
After hybridization, nonhybridized nucleotides from the biological sample are removed, and a 
fluorescence scanner is used to detect hybridization at each array element. Alternatively, laser 
desorption and mass spectrometry may be used for detection of hybridization. The degree of 
complementarity and the relative abundance of each polynucleotide which hybridizes to an element on 
the microarray may be assessed. In one embodiment, microarray preparation and usage are described 
in detail below. 
Tissue or Cell Sample Preparation 

Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and 
poly(A)+ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)+ RNA sample is 
reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/μl oligo-(dT) primer (21mer), 1X first 
strand buffer, 0.03 units/μl RNase inhibitor, 500 μM dATP, 500 μM dGTP, 500 μM dTTP, 40 μM 
dCTP, 40 μM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Biosciences). The reverse transcription 
reaction is performed in a 25 μl volume containing 200 ng poly(A)+ RNA with GEMBRIGHT kits 
(Incyte). Specific control poly(A)+ RNAs are synthesized by in vitro transcription from non-coding 
yeast genomic DNA. After incubation at 37°C for 2 hr, each reaction sample (one with Cy3 and 
another with Cy5 labeling) is treated with 2.5 μl of 0.5 M sodium hydroxide and incubated for 20 
minutes at 85°C to stop the reaction and degrade the RNA. Samples are purified using two 
successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc. 
(CLONTECH), Palo Alto CA) and, after combining, both reaction samples are ethanol precipitated 
using 1 μl of glycogen (1 mg/ml), 60 μl sodium acetate, and 300 μl of 100% ethanol. The sample is 
then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and resuspended 
in 14 μl 5X SSC/0.2% SDS. 
Microarray Preparation 

Sequences of the present invention are used to generate array elements. Each array element 
is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification uses 
primers complementary to the vector sequences flanking the cDNA insert. Array elements are 
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 μg. 
Amplified array elements are then purified using SEPHACRYL-400 (Amersham Biosciences). 
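
As a rough check on the amplification figures above, the stated yield can be converted into a fold increase and an average per-cycle gain. The following is a minimal sketch only: the function name is hypothetical, and it assumes that the starting quantity (1-2 ng) and the final quantity (5 μg) refer to the same DNA species, which the text does not state.

```python
import math


def amplification_summary(start_ng, final_ug, cycles=30):
    """Return (fold increase, equivalent perfect doublings, mean per-cycle gain).

    Assumption (not from the source): start and final quantities describe
    the same amplified species, so their ratio is the fold amplification.
    """
    fold = (final_ug * 1000.0) / start_ng       # convert ug to ng before dividing
    doublings = math.log2(fold)                 # perfect doublings that give this yield
    mean_gain = fold ** (1.0 / cycles) - 1.0    # average fractional increase per cycle
    return fold, doublings, mean_gain


# Upper end of the stated starting range (2 ng) and the stated minimum yield (5 ug).
fold, doublings, mean_gain = amplification_summary(start_ng=2.0, final_ug=5.0)
print(f"fold={fold:.0f} doublings={doublings:.1f} mean_gain={mean_gain:.2f}")
```

Under these assumptions the stated yield corresponds to far fewer than thirty perfect doublings, so the per-cycle gain averaged over the whole run is well below 100%.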

Purified array elements are immobilized on polymer-coated glass slides. Glass microscope 
slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water 
washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR 
Scientific Products Corporation (VWR), West Chester PA), washed extensively in distilled water, and 
coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 110°C 
oven. 

Array elements are applied to the coated glass substrate using a procedure described in U.S. 
Patent No. 5,807,522, incorporated herein by reference. 1 μl of the array element DNA, at an average 
concentration of 100 ng/μl, is loaded into the open capillary printing element by a high-speed robotic 
apparatus. The apparatus then deposits about 5 nl of array element sample per slide. 
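
The quantities in the preceding paragraph imply a DNA mass per deposit and an upper bound on the number of deposits per capillary load. The sketch below works through that arithmetic; the function names are hypothetical, and the deposit count assumes no dead volume or evaporation, which the source does not address.

```python
def deposit_mass_ng(deposit_nl, conc_ng_per_ul):
    """Mass of DNA in one deposit, in ng (1 ul = 1000 nl)."""
    return deposit_nl * conc_ng_per_ul / 1000.0


def max_deposits_per_load(load_ul, deposit_nl):
    """Upper bound on deposits from one load, assuming no dead volume."""
    return int(round(load_ul * 1000.0 / deposit_nl))


mass = deposit_mass_ng(deposit_nl=5, conc_ng_per_ul=100)     # about 5 nl at 100 ng/ul
deposits = max_deposits_per_load(load_ul=1, deposit_nl=5)    # 1 ul load
print(mass, deposits)
```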

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene). 
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water. 
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate 
buffered saline (PBS) (Tropix, Inc., Bedford MA) for 30 minutes at 60°C followed by washes in 0.2% 
SDS and distilled water as before. 
Hybridization 

Hybridization reactions contain 9 μl of sample mixture consisting of 0.2 μg each of Cy3 and 
Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The sample 
mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered with 
a 1.8 cm² coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly 
larger than a microscope slide. The chamber is kept at 100% humidity internally by the addition of 140 
μl of 5X SSC in a corner of the chamber. The chamber containing the arrays is incubated for about 
6.5 hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash buffer (1X SSC, 0.1% 
SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X SSC), and dried. 
Detection 

Reporter-labeled hybridization complexes are detected with a microscope equipped with an 
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines 
at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is 
focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide 
containing the array is placed on a computer-controlled X-Y stage on the microscope and raster-scanned 
past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned with a 
resolution of 20 micrometers. 
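
The array dimensions and scan resolution given above fix the size of the raster image. A short sketch of that calculation follows; the function name is hypothetical, and it assumes one sample per 20 micrometer step along each axis.

```python
def pixels_per_side(side_cm, resolution_um):
    """Number of scan positions along one side (1 cm = 10,000 um)."""
    return round(side_cm * 10_000 / resolution_um)


side_px = pixels_per_side(side_cm=1.8, resolution_um=20)
total_px = side_px * side_px   # samples per scan of the square array
print(side_px, total_px)
```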

In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially. 
Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477, 
Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores. Appropriate 
filters positioned between the array and the photomultiplier tubes are used to filter the signals. The 
emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is 
typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source, 
although the apparatus is capable of recording the spectra from both fluorophores simultaneously. 
The sensitivity of the scans is typically calibrated using the signal intensity generated by a 
cDNA control species added to the sample mixture at a known concentration. A specific location on 
the array contains a complementary DNA sequence, allowing the intensity of the signal at that location 
to be correlated with a weight ratio of hybridizing species of 1:100,000. When two samples from 
different sources (e.g., representing test and control cells), each labeled with a different fluorophore, 
are hybridized to a single array for the purpose of identifying genes that are differentially expressed, 
the calibration is done by labeling samples of the calibrating cDNA with the two fluorophores and 
adding identical amounts of each to the hybridization mixture. 
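
Because identical amounts of the calibrating cDNA are added in both labels, the two signals at the control location can be used to balance the channels before ratios are formed. The sketch below shows one way this could be done; the function names and intensity values are hypothetical, and the single-factor linear scaling is an assumption rather than the method of the analysis software.

```python
def balance_factor(control_cy3, control_cy5):
    """Factor that scales Cy5 so equal input gives equal signal (assumed linear)."""
    return control_cy3 / control_cy5


def balanced_ratio(cy3, cy5, factor):
    """Cy5/Cy3 expression ratio after channel balancing."""
    return (cy5 * factor) / cy3


# Hypothetical control-spot intensities for equal amounts of calibrating cDNA.
factor = balance_factor(control_cy3=8000.0, control_cy5=4000.0)
ratio = balanced_ratio(cy3=1000.0, cy5=1500.0, factor=factor)
print(factor, ratio)
```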

The output of the photomultiplier tube is digitized using a 12-bit RTT-835H analog-to-digital 
(A/D) conversion board (Analog Devices, Inc., Norwood MA) installed in an IBM-compatible PC 
computer. The digitized data are displayed as an image where the signal intensity is mapped using a 
linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high 
signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and 
measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping emission 
spectra) between the fluorophores using each fluorophore's emission spectrum. 
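
A crosstalk correction of the kind described above can be sketched as a two-channel linear unmixing. This is an illustration under stated assumptions, not the procedure of the source: the leak fractions below are hypothetical placeholders that would in practice be derived from each fluorophore's emission spectrum and the detector filters.

```python
def unmix(measured_cy3, measured_cy5, cy5_into_cy3, cy3_into_cy5):
    """Solve the 2x2 linear crosstalk model for the true channel signals.

    Assumed model (hypothetical coefficients):
        measured_cy3 = true_cy3 + cy5_into_cy3 * true_cy5
        measured_cy5 = cy3_into_cy5 * true_cy3 + true_cy5
    """
    det = 1.0 - cy5_into_cy3 * cy3_into_cy5
    if det == 0.0:
        raise ValueError("crosstalk model is singular")
    true_cy3 = (measured_cy3 - cy5_into_cy3 * measured_cy5) / det
    true_cy5 = (measured_cy5 - cy3_into_cy5 * measured_cy3) / det
    return true_cy3, true_cy5


# Hypothetical readings with 2% Cy5-to-Cy3 and 10% Cy3-to-Cy5 leakage.
cy3, cy5 = unmix(1010.0, 600.0, cy5_into_cy3=0.02, cy3_into_cy5=0.10)
print(round(cy3, 3), round(cy5, 3))
```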
A grid is superimposed over the fluorescence signal image such that the signal from each spot 
is centered in each element of the grid. The fluorescence signal within each element is then integrated 
to obtain a numerical value corresponding to the average intensity of the signal. The software used 
for signal analysis is the GEMTOOLS gene expression analysis program (Incyte). Array elements 
that exhibited at least about a two-fold change in expression, a signal-to-background ratio of at least 
2.5, and an element spot size of at least 40% were identified as differentially expressed using the 
GEMTOOLS program (Incyte Genomics). 
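
The three selection criteria stated above can be expressed as a simple filter. The sketch below is not the GEMTOOLS implementation: the record fields, the example values, and the choice to apply the signal-to-background test to the brighter channel are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ArrayElement:
    name: str
    cy3: float            # integrated Cy3 signal (hypothetical units)
    cy5: float            # integrated Cy5 signal
    background: float     # local background estimate
    spot_fraction: float  # spot size as a fraction of the expected area


def is_differential(e, min_fold=2.0, min_sb=2.5, min_spot=0.40):
    """Apply the fold-change, signal-to-background, and spot-size thresholds."""
    fold = max(e.cy5 / e.cy3, e.cy3 / e.cy5)           # change in either direction
    signal_to_bg = max(e.cy3, e.cy5) / e.background    # assumption: brighter channel
    return fold >= min_fold and signal_to_bg >= min_sb and e.spot_fraction >= min_spot


elements = [
    ArrayElement("A", cy3=400, cy5=1200, background=100, spot_fraction=0.8),
    ArrayElement("B", cy3=1000, cy5=1500, background=100, spot_fraction=0.8),  # fold too low
    ArrayElement("C", cy3=100, cy5=300, background=150, spot_fraction=0.8),    # too dim
    ArrayElement("D", cy3=400, cy5=1200, background=100, spot_fraction=0.3),   # spot too small
]
selected = [e.name for e in elements if is_differential(e)]
print(selected)
```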
Expression 

In one example, SEQ ID NO:41 showed differential expression in several human breast 
cancer cell lines, as determined by microarray analysis. HMEC is a human primary mammary 
epithelial cell strain derived from normal mammary tissue (Clonetics, San Diego, CA). The following 
cell lines were tested on microarrays: MCF-10A is a human breast mammary gland cell line isolated 
from a 36-year-old female with fibrocystic breast disease; SkBR3 is a breast adenocarcinoma cell line 
isolated from a malignant pleural effusion of a 43-year-old female; MCF7 is a breast adenocarcinoma 
cell line derived from the pleural effusion of a 69-year-old female; T47D is a breast carcinoma cell 
line derived from a pleural effusion from a 54-year-old female with an infiltrating ductal carcinoma of 
the breast; BT20 is a breast carcinoma cell line derived in vitro from cells emigrating out of thin slices 
of a tumor mass isolated from a 74-year-old female; MDA-mb-231 is a metastatic breast tumor cell 
line derived from the pleural effusion of a 51-year-old female with metastatic breast carcinoma. All 
cell cultures were propagated in media according to the supplier's recommendations and grown to 70-80% 
confluence prior to RNA isolation. 

The expression of cDNAs from the five tumor cell lines representing various stages of breast 
tumor progression (BT20, MCF7, MDA-mb-231, SKBr3, and T47D) was compared with that of the 
non-malignant mammary epithelial cell lines, HMEC or MCF-10A. 

SEQ ID NO:41 showed at least two-fold differential expression when comparing HMEC cells 
versus Sk-BR-3 and T-47D cells. Additionally, SEQ ID NO:41 expression was decreased at least 
two-fold when comparing breast cells from fibrocystic breast tissue versus BT-20, MCF-7, MDA-mb- 
231, Sk-BR-3, and T-47D cancerous cell lines. These experiments indicate that SEQ ID NO:41 was 
significantly under-expressed in the breast tumor cell lines tested, further establishing the utility of SEQ 
ID NO:41 as a diagnostic marker or as a potential therapeutic target for breast cancer. 

In another example, SEQ ID NO:44 is upregulated 3.9 fold in dendritic cells (DC) as compared to monocytes, 
suggesting that SEQ ID NO:44, encoding SEQ ID NO:8, could be used, for example, to understand the 
process by which monocytes differentiate into immature dendritic cells and eventually allow 
manipulation of the immune system leading to potential immunotherapies for diseases such as cancer, 
AIDS, and infectious diseases; and enhancing vaccine efficacy. 

In another example, the expression of SEQ ID NO:48 is upregulated in six out of seven 
PBMC populations (each of which was obtained from a different donor) treated with Staphylococcal 
exotoxins (SEB). The PBMCs were stimulated in vitro with SEB for 24 and 72 hours. The 
expression of SEQ ID NO:48 was higher after 24 hours and dropped after 72 hours. Therefore, SEQ 
ID NO:48 is useful in diagnostic assays for immune responses. 

In yet another example, SEQ ID NO:64 showed differential expression in the C3A cell line, a 
well-established in vitro model of the mature human liver (Mickelson, J.K. et al. (1995) Hepatology 
22:866-875; Nagendra, A.R. et al. (1997) Am. J. Physiol. 272:G408-G416), as determined by 
microarray analysis. The effects upon liver metabolism and hormone clearance mechanisms are 
important to understand the pharmacodynamics of a drug. For example, the human C3A cell line is a 
clonal derivative of HepG2/C3 (hepatoma cell line, isolated from a 15-year-old male with liver tumor), 
which was selected for strong contact inhibition of growth. The use of a clonal population enhances 
the reproducibility of the cells. C3A cells have many characteristics of primary human hepatocytes in 
culture: i) expression of insulin receptor and insulin-like growth factor II receptor; ii) secretion of a 
high ratio of serum albumin compared with α-fetoprotein; iii) conversion of ammonia to urea and 
glutamine; iv) ability to metabolize aromatic amino acids; and v) proliferation in glucose-free and 
insulin-free medium. SEQ ID NO:64 showed differential expression in C3A cells treated with a 
variety of steroids including beclomethasone, medroxyprogesterone, budesonide, prednisone, 
dexamethasone, and progesterone, versus untreated C3A cells, as determined by microarray analysis. 
Therefore, SEQ ID NO:64 is useful for the diagnosis and monitoring of liver, endocrine, and 
reproductive diseases and in the diagnosis of and as a therapeutic target for inflammatory diseases and 
humoral immune response. 

XII. Complementary Polynucleotides 

Sequences complementary to the NAAP-encoding sequences, or any parts thereof, are used 
to detect, decrease, or inhibit expression of naturally occurring NAAP. Although use of 
oligonucleotides comprising from about 15 to 30 base pairs is described, essentially the same 
procedure is used with smaller or with larger sequence fragments. Appropriate oligonucleotides are 
designed using OLIGO 4.06 software (National Biosciences) and the coding sequence of NAAP. To 
inhibit transcription, a complementary oligonucleotide is designed from the most unique 5' sequence 
and used to prevent promoter binding to the coding sequence. To inhibit translation, a complementary 
oligonucleotide is designed to prevent ribosomal binding to the NAAP-encoding transcript. 
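
The core operation in designing a complementary oligonucleotide is taking the reverse complement of a chosen target window. The sketch below illustrates only that step and is not the OLIGO 4.06 procedure: the function names and the example sequence are hypothetical, and no uniqueness, melting-temperature, or secondary-structure checks are performed.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def complementary_oligo(coding_seq, start, length=20):
    """Oligonucleotide complementary to coding_seq[start:start+length].

    The 15-30 nucleotide bound follows the range described in the text.
    """
    if not 15 <= length <= 30:
        raise ValueError("length outside the 15-30 nucleotide range")
    target = coding_seq[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    return reverse_complement(target)


# Hypothetical sequence used purely for illustration; not a NAAP sequence.
example = "ATGGCCAAGCTTGGATCCGAATTC"
oligo = complementary_oligo(example, start=0, length=15)
print(oligo)
```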

XIII. Expression of NAAP 

Expression and purification of NAAP is achieved using bacterial or virus-based expression 
systems. For expression of NAAP in bacteria, cDNA is subcloned into an appropriate vector 
containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA 
transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid 
promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory 
element. Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3). 
Antibiotic resistant bacteria express NAAP upon induction with isopropyl beta-D-thiogalactopyranoside 
(IPTG). Expression of NAAP in eukaryotic cells is achieved by infecting insect 
or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus 
(AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is 
replaced with cDNA encoding NAAP by either homologous recombination or bacterial-mediated 
transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong 
polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to 
infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases. 
Infection of the latter requires additional genetic modifications to baculovirus. (See Engelhard, E.K. et 
al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 
7:1937-1945.) 

In most expression systems, NAAP is synthesized as a fusion protein with, e.g., glutathione S-transferase 
(GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step, 
affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-kilodalton 
enzyme from Schistosoma japonicum, enables the purification of fusion proteins on immobilized 
glutathione under conditions that maintain protein activity and antigenicity (Amersham Biosciences). 
Following purification, the GST moiety can be proteolytically cleaved from NAAP at specifically 
engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification using 
commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak). 6-His, a 
stretch of six consecutive histidine residues, enables purification on metal-chelate resins (QIAGEN). 
Methods for protein expression and purification are discussed in Ausubel (1995, supra, ch. 10 and 16). 
Purified NAAP obtained by these methods can be used directly in the assays shown in Examples 
XVII, XVIII, and XIX, where applicable. 
XIV. Functional Assays 

NAAP function is assessed by expressing the sequences encoding NAAP at physiologically 
elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression 
vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice 
include pCMV SPORT plasmid (Invitrogen, Carlsbad CA) and pCR3.1 plasmid (Invitrogen), both of 
which contain the cytomegalovirus promoter. 5-10 μg of recombinant vector are transiently 
transfected into a human cell line, for example, an endothelial or hematopoietic cell line, using either 
liposome formulations or electroporation. 1-2 μg of an additional plasmid containing sequences 
encoding a marker protein are co-transfected. Expression of a marker protein provides a means to 
distinguish transfected cells from nontransfected cells and is a reliable predictor of cDNA expression 
from the recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein 
(GFP; Clontech), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated, laser 
optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to 
evaluate the apoptotic state of the cells and other cellular properties. FCM detects and quantifies the 
uptake of fluorescent molecules that diagnose events preceding or coincident with cell death. These 
events include changes in nuclear DNA content as measured by staining of DNA with propidium 
iodide; changes in cell size and granularity as measured by forward light scatter and 90 degree side 
light scatter; down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine 
uptake; alterations in expression of cell surface and intracellular proteins as measured by reactivity 
with specific antibodies; and alterations in plasma membrane composition as measured by the binding 
of fluorescein-conjugated Annexin V protein to the cell surface. Methods in flow cytometry are 
discussed in Ormerod, M.G. (1994) Flow Cytometry, Oxford, New York NY. 

The influence of NAAP on gene expression can be assessed using highly purified populations 
of cells transfected with sequences encoding NAAP and either CD64 or CD64-GFP. CD64 and 
CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human 
immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using 
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake Success 
NY). mRNA can be purified from the cells using methods well known by those of skill in the art. 
Expression of mRNA encoding NAAP and other genes of interest can be analyzed by northern 
analysis or microarray techniques. 
XV. Production of NAAP Specific Antibodies 

NAAP substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g., 
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to 
immunize animals (e.g., rabbits, mice, etc.) and to produce antibodies using standard protocols. 

Alternatively, the NAAP amino acid sequence is analyzed using LASERGENE software 
(DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is 
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for 
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well 
described in the art (Ausubel, 1995, supra, ch. 11). 

Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A 
peptide synthesizer (Applied Biosystems) using FMOC chemistry and coupled to KLH (Sigma-Aldrich, 
St. Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to 
increase immunogenicity (Ausubel, 1995, supra). Rabbits are immunized with the oligopeptide-KLH 
complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide and anti-NAAP 
activity by, for example, binding the peptide or NAAP to a substrate, blocking with 1% BSA, reacting 
with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG. 

XVI. Purification of Naturally Occurring NAAP Using Specific Antibodies 

Naturally occurring or recombinant NAAP is substantially purified by immunoaffinity 
chromatography using antibodies specific for NAAP. An immunoaffinity column is constructed by 
covalently coupling anti-NAAP antibody to an activated chromatographic resin, such as 
CNBr-activated SEPHAROSE (Amersham Biosciences). After the coupling, the resin is blocked and 
washed according to the manufacturer's instructions. 

Media containing NAAP are passed over the immunoaffinity column, and the column is 
washed under conditions that allow the preferential absorbance of NAAP (e.g., high ionic strength 
buffers in the presence of detergent). The column is eluted under conditions that disrupt 
antibody/NAAP binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such 
as urea or thiocyanate ion), and NAAP is collected. 

XVII. Identification of Molecules Which Interact with NAAP 

NAAP, or biologically active fragments thereof, are labeled with ¹²⁵I Bolton-Hunter reagent 
(Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539). Candidate molecules previously 
arrayed in the wells of a multi-well plate are incubated with the labeled NAAP, washed, and any wells 
with labeled NAAP complex are assayed. Data obtained using different concentrations of NAAP are 
used to calculate values for the number, affinity, and association of NAAP with the candidate 
molecules. 
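
Binding data collected at several concentrations can be reduced to an affinity and a number of binding sites in more than one way. The sketch below uses a Scatchard linearization, which is an assumption; the source does not name an analysis method. The function names and the synthetic data are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def scatchard(free, bound):
    """Estimate (Kd, Bmax) from bound/free plotted against bound.

    For single-site binding, bound/free = Bmax/Kd - bound/Kd, so the
    slope is -1/Kd and the intercept is Bmax/Kd.
    """
    ratios = [b / f for b, f in zip(bound, free)]
    slope, intercept = linear_fit(bound, ratios)
    kd = -1.0 / slope
    return kd, intercept * kd


# Synthetic, noise-free data generated with Kd = 5 and Bmax = 100 (arbitrary units).
free_conc = [1.0, 2.0, 5.0, 10.0, 20.0]
bound_conc = [100.0 * f / (5.0 + f) for f in free_conc]
kd, bmax = scatchard(free_conc, bound_conc)
print(round(kd, 3), round(bmax, 3))
```

With real measurements, a nonlinear fit to the binding isotherm is often preferred because the Scatchard transform distorts the error structure; the linear form is shown here only because it is compact.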

Alternatively, molecules interacting with NAAP are analyzed using the yeast two-hybrid 
system as described in Fields, S. and O. Song (1989) Nature 340:245-246, or using commercially 
available kits based on the two-hybrid system, such as the MATCHMAKER system (Clontech). 

NAAP may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT) 
which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions 
between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. 
Patent No. 6,057,101). 

XVIII. Demonstration of NAAP Activity 

NAAP activity is measured by its ability to stimulate transcription of a reporter gene (Liu, 
H.Y. et al. (1997) EMBO J. 16:5289-5298). The assay entails the use of a well characterized 
reporter gene construct, LexAop-LacZ, that consists of LexA DNA transcriptional control elements 
(LexAop) fused to sequences encoding the E. coli LacZ enzyme. The methods for constructing and 
expressing fusion genes, introducing them into cells, and measuring LacZ enzyme activity, are well 
known to those skilled in the art. Sequences encoding NAAP are cloned into a plasmid that directs 
the synthesis of a fusion protein, LexA-NAAP, consisting of NAAP and a DNA binding domain 
derived from the LexA transcription factor. The resulting plasmid, encoding a LexA-NAAP fusion 
protein, is introduced into yeast cells along with a plasmid containing the LexAop-LacZ reporter gene. 
The amount of LacZ enzyme activity associated with LexA-NAAP transfected cells, relative to 
control cells, is proportional to the amount of transcription stimulated by the NAAP. 

Alternatively, NAAP activity is measured by its ability to bind zinc. A 5-10 μM sample 
solution in 2.5 mM ammonium acetate solution at pH 7.4 is combined with 0.05 M zinc sulfate solution 
(Aldrich, Milwaukee WI) in the presence of 100 μM dithiothreitol with 10% methanol added. The 
sample and zinc sulfate solutions are allowed to incubate for 20 minutes. The reaction solution is 
passed through a VYDAC column (Grace Vydac, Hesperia, CA) with approximately 300 Angstrom 
bore size and 5 μm particle size to isolate zinc-sample complex from the solution, and into a mass 
spectrometer (PE Sciex, Ontario, Canada). Zinc bound to sample is quantified using the functional 
atomic mass of 63.5 Da observed by Whittal et al. (2000; Biochemistry 39:8406-8417). 
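
One way to turn the mass spectrometry readout into a zinc count is to divide the mass difference between the zinc-bound and zinc-free species by the functional mass of 63.5 Da cited above. The sketch below assumes that both masses are available from deconvoluted spectra; the function name and the example masses are hypothetical.

```python
ZINC_FUNCTIONAL_MASS_DA = 63.5  # functional atomic mass cited in the text


def zinc_ions_bound(apo_mass_da, complex_mass_da, zinc_mass_da=ZINC_FUNCTIONAL_MASS_DA):
    """Apparent number of zinc ions bound, from the observed mass shift."""
    return (complex_mass_da - apo_mass_da) / zinc_mass_da


# Hypothetical masses: a 10,000 Da sample shifted by 127 Da on zinc binding.
n_zinc = zinc_ions_bound(apo_mass_da=10000.0, complex_mass_da=10127.0)
print(n_zinc)
```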

In the alternative, a method to determine nucleic acid binding activity of NAAP involves a 
polyacrylamide gel mobility-shift assay. In preparation for this assay, NAAP is expressed by 
transforming a mammalian cell line such as COS7, HeLa or CHO with a eukaryotic expression vector 
containing NAAP cDNA. The cells are incubated for 48-72 hours after transformation under 
conditions appropriate for the cell line to allow expression and accumulation of NAAP. Extracts 
containing solubilized proteins can be prepared from cells expressing NAAP by methods well known 
in the art. Portions of the extract containing NAAP are added to [³²P]-labeled RNA or DNA. 
Radioactive nucleic acid can be synthesized in vitro by techniques well known in the art. The 
mixtures are incubated at 25°C in the presence of RNase- and DNase-inhibitors under buffered 
conditions for 5-10 minutes. After incubation, the samples are analyzed by polyacrylamide gel 
electrophoresis followed by autoradiography. The presence of a band on the autoradiogram indicates 
the formation of a complex between NAAP and the radioactive transcript. A band of similar mobility 
will not be present in samples prepared using control extracts prepared from untransformed cells. 

In the alternative, a method to determine methylase activity of NAAP measures transfer of 
radiolabeled methyl groups between a donor substrate and an acceptor substrate. Reaction mixtures 
(50 μl final volume) contain 15 mM HEPES, pH 7.9, 1.5 mM MgCl₂, 10 mM dithiothreitol, 3% 
polyvinylalcohol, 1.5 μCi [methyl-³H]AdoMet (0.375 μM AdoMet) (DuPont-NEN), 0.6 μg NAAP, and 
acceptor substrate (e.g., 0.4 μg [³⁵S]RNA, or 6-mercaptopurine (6-MP) to 1 mM final concentration). 
Reaction mixtures are incubated at 30°C for 30 minutes, then 65°C for 5 minutes. 

Analysis of [methyl-³H]RNA is as follows: (1) 50 μl of 2 x loading buffer (20 mM Tris-HCl, 
pH 7.6, 1 M LiCl, 1 mM EDTA, 1% sodium dodecyl sulphate (SDS)) and 50 μl oligo d(T)-cellulose 
(10 mg/ml in 1 x loading buffer) are added to the reaction mixture, and incubated at ambient 
temperature with shaking for 30 minutes. (2) Reaction mixtures are transferred to a 96-well filtration 
plate attached to a vacuum apparatus. (3) Each sample is washed sequentially with three 2.4 ml 
aliquots of 1 x oligo d(T) loading buffer containing 0.5% SDS, 0.1% SDS, or no SDS. (4) RNA is 
eluted with 300 μl of water into a 96-well collection plate, transferred to scintillation vials containing 
liquid scintillant, and radioactivity determined. 

Analysis of [methyl-³H]6-MP is as follows: (1) 500 μl 0.5 M borate buffer, pH 10.0, and then 
2.5 ml of 20% (v/v) isoamyl alcohol in toluene are added to the reaction mixtures. (2) The samples 
are mixed by vigorous vortexing for ten seconds. (3) After centrifugation at 700g for 10 minutes, 1.5 
ml of the organic phase is transferred to scintillation vials containing 0.5 ml absolute ethanol and liquid 
scintillant, and radioactivity determined. (4) Results are corrected for the extraction of 6-MP into the 
organic phase (approximately 41%). For both [methyl-³H]RNA and [methyl-³H]6-MP, NAAP 
activity is proportional to the measured radioactivity. 
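
The corrections implied by steps (3) and (4) of the 6-MP analysis can be sketched as follows. This is an interpretation, not the source's stated formula: it assumes that the 41% figure is the fraction of labeled material recovered in the organic phase, and that counts are scaled up because only 1.5 ml of the 2.5 ml organic phase is counted. The function name, the blank parameter, and the example counts are hypothetical.

```python
def corrected_counts(counted_cpm, blank_cpm=0.0, counted_ml=1.5,
                     organic_ml=2.5, extraction_fraction=0.41):
    """Scale counted radioactivity to the whole reaction.

    Assumptions (not confirmed by the source): extraction_fraction is the
    share of labeled material recovered in the organic phase, and the
    counted aliquot is representative of the whole organic phase.
    """
    net = counted_cpm - blank_cpm                  # remove no-enzyme background
    whole_phase = net * (organic_ml / counted_ml)  # aliquot -> whole organic phase
    return whole_phase / extraction_fraction       # organic phase -> whole reaction


total_cpm = corrected_counts(counted_cpm=1230.0)
print(round(total_cpm, 1))
```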

Alternatively, DNA repair activity of DNAME is measured as incorporation of [³²P]dATP 
into a plasmid treated with a DNA damaging agent, such as cisplatin or ultraviolet irradiation, relative 
to a control, untreated plasmid DNA (Coudore, F. et al. (1997) FEBS Lett. 414:581-584). Cell 
extracts are purified from mammalian cell lines, E. coli, or S. cerevisiae having compromised 
endogenous repair activities due to mutations in repair enzymes. Cell extracts are prepared by 
hypotonic lysis of cells followed by centrifugation at 300,000 x g. Extracts are treated with 63% 
ammonium sulfate to minimize non-specific nuclease activity. The repair synthesis assay is performed 
in a 50 μl reaction volume containing 200 μg protein in cell extract, 300 ng damaged plasmid, 300 ng 
control plasmid, 4 μM dATP, 20 μM each dCTP, dTTP, and dGTP, 0.2 μM [³²P]dATP, 20 mM 
HEPES-KOH (pH 7.8), 2.5 μg creatine phosphokinase, 7 mM MgCl₂, and 2 mM EGTA. Identical 
reactions are set up with and without purified DNAME. After a 3 h incubation at 30°C, reaction 
mixtures are treated with 200 μg/ml proteinase K and 0.5% SDS. Plasmid DNA is purified from 
reaction mixtures by phenol-chloroform extraction and ethanol precipitation. Data are quantified by gel 
electrophoresis of linearized plasmid followed by autoradiography, scintillation counting of excised 
DNA bands, and densitometry of the photographic negative of the gel to normalize for plasmid DNA 
recovery. 

In the alternative, type I topoisomerase activity of NAAP can be assayed based on the 
relaxation of a supercoiled DNA substrate. NAAP is incubated with its substrate in a buffer lacking 
Mg²⁺ and ATP, the reaction is terminated, and the products are loaded on an agarose gel. Altered 
topoisomers can be distinguished from supercoiled substrate electrophoretically. This assay is specific 
for type I topoisomerase activity because Mg²⁺ and ATP are necessary cofactors for type II 
topoisomerases. 

Type II topoisomerase activity of NAAP can be assayed based on the decatenation of a 
kinetoplast DNA (KDNA) substrate. NAAP is incubated with KDNA, the reaction is terminated, 
and the products are loaded on an agarose gel. Monomeric circular KDNA can be distinguished from 
catenated KDNA electrophoretically. Kits for measuring type I and type II topoisomerase activities 
are available commercially from Topogen (Columbus OH). 

ATP-dependent RNA helicase unwinding activity of NAAP can be measured by the method 
described by Zhang and Grosse (1994; Biochemistry 33:3906-3912). The substrate for RNA 
unwinding consists of ³²P-labeled RNA composed of two RNA strands of 194 and 130 nucleotides in 
length containing a duplex region of 17 base-pairs. The RNA substrate is incubated together with 
ATP, Mg²⁺, and varying amounts of NAAP in a Tris-HCl buffer, pH 7.5, at 37°C for 30 minutes. The 
single-stranded RNA product is then separated from the double-stranded RNA substrate by 
electrophoresis through a 10% SDS-polyacrylamide gel, and quantitated by autoradiography. The 
amount of single-stranded RNA recovered is proportional to the amount of NAAP in the preparation. 
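
Band intensities from the autoradiogram are commonly reduced to a fraction of substrate unwound. The sketch below shows one such reduction, including subtraction of a no-enzyme control lane; the correction scheme, function names, and intensity values are assumptions and are not taken from the cited method.

```python
def fraction_unwound(single_stranded, double_stranded):
    """Fraction of labeled RNA running as single-stranded product in one lane."""
    return single_stranded / (single_stranded + double_stranded)


def helicase_activity(sample_ss, sample_ds, control_ss, control_ds):
    """Fraction unwound in the sample lane minus the no-enzyme control lane."""
    return fraction_unwound(sample_ss, sample_ds) - fraction_unwound(control_ss, control_ds)


# Hypothetical band intensities in arbitrary densitometry units.
activity = helicase_activity(sample_ss=300.0, sample_ds=700.0,
                             control_ss=50.0, control_ds=950.0)
print(round(activity, 3))
```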

In the alternative, NAAP function is assessed by expressing the sequences encoding NAAP 
at physiologically elevated levels in mammalian cell culture systems. cDNA is subcloned into a 
mammalian expression vector containing a strong promoter that drives high levels of cDNA 
expression. Vectors of choice include pCMV SPORT (Life Technologies) and pCR3.1 (Invitrogen 
Corporation, Carlsbad CA), both of which contain the cytomegalovirus promoter. 5-10 μg of 
recombinant vector are transiently transfected into a human cell line, preferably of endothelial or 
hematopoietic origin, using either liposome formulations or electroporation. 1-2 μg of an additional 
plasmid containing sequences encoding a marker protein are co-transfected. 
Expression of a marker protein provides a means to distinguish transfected cells from
nontransfected cells and is a reliable predictor of cDNA expression from the recombinant vector.
Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; CLONTECH), CD64, or a
CD64-GFP fusion protein. Flow cytometry (FCM), an automated laser optics-based technique, is
used to identify transfected cells expressing GFP or CD64-GFP and to evaluate the apoptotic state of
the cells and other cellular properties.

FCM detects and quantifies the uptake of fluorescent molecules that diagnose events
preceding or coincident with cell death. These events include changes in nuclear DNA content as
measured by staining of DNA with propidium iodide; changes in cell size and granularity as measured
by forward light scatter and 90 degree side light scatter; down-regulation of DNA synthesis as
measured by decrease in bromodeoxyuridine uptake; alterations in expression of cell surface and
intracellular proteins as measured by reactivity with specific antibodies; and alterations in plasma
membrane composition as measured by the binding of fluorescein-conjugated Annexin V protein to the
cell surface. Methods in flow cytometry are discussed in Ormerod, M. G. (1994) Flow Cytometry,
Oxford, New York NY.

The influence of NAAP on gene expression can be assessed using highly purified populations
of cells transfected with sequences encoding NAAP and either CD64 or CD64-GFP. CD64 and
CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human
immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Inc., Lake
Success NY). mRNA can be purified from the cells using methods well known by those of skill in the
art. Expression of mRNA encoding NAAP and other genes of interest can be analyzed by northern
analysis or microarray techniques.

Pseudouridine synthase activity of NAAP is assayed using a tritium (³H) release assay
modified from Nurse et al. (1995; RNA 1:102-112), which measures the release of ³H from the C5
position of the pyrimidine component of uridylate (U) when ³H-radiolabeled U in RNA is isomerized to
pseudouridine (Ψ). A typical 500 μl assay mixture contains 50 mM HEPES buffer (pH 7.5), 100 mM
ammonium acetate, 5 mM dithiothreitol, 1 mM EDTA, 30 units RNase inhibitor, and 0.1-4.2 μM
[5-³H]tRNA (approximately 1 μCi/nmol tRNA). The reaction is initiated by the addition of <5 μl of a
concentrated solution of NAAP (or sample containing NAAP) and incubated for 5 min at 37 °C.
Portions of the reaction mixture are removed at various times (up to 30 min) following the addition of
NAAP and quenched by dilution into 1 ml 0.1 M HCl containing Norit-SA3 (12% w/v). The
quenched reaction mixtures are centrifuged for 5 min at maximum speed in a microcentrifuge, and the
supernatants are filtered through a plug of glass wool. The pellet is washed twice by resuspension in 1
ml 0.1 M HCl, followed by centrifugation. The supernatants from the washes are separately passed
through the glass wool plug and combined with the original filtrate. A portion of the combined filtrate
is mixed with scintillation fluid (up to 10 ml) and counted using a scintillation counter. The amount of
³H released from the RNA and present in the soluble filtrate is proportional to the amount of
pseudouridine synthase activity in the sample (Ramamurthy, V. (1999) J. Biol. Chem.
274:22225-22230).
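
The scintillation counts from the soluble filtrate can be converted to an amount of released ³H using the specific activity of the substrate. The following is a minimal sketch under stated assumptions: the counting efficiency and the no-enzyme blank are hypothetical inputs not specified in the protocol above, the function name is illustrative, and the default specific activity is the approximate 1 μCi/nmol figure given for the tRNA substrate. The only fixed constant is the definition 1 μCi = 2.22 × 10⁶ disintegrations per minute.

```python
DPM_PER_MICROCURIE = 2.22e6  # 1 uCi = 3.7e4 decays/s * 60 s/min


def tritium_released_pmol(cpm_sample, cpm_blank, counting_efficiency,
                          specific_activity_uci_per_nmol=1.0):
    """Convert net scintillation counts to pmol of substrate equivalents.

    cpm_blank is an assumed no-enzyme control counted the same way.
    counting_efficiency is the assumed fraction of decays detected (0-1].
    """
    if not 0 < counting_efficiency <= 1:
        raise ValueError("counting_efficiency must be in (0, 1]")
    if specific_activity_uci_per_nmol <= 0:
        raise ValueError("specific activity must be positive")
    net_dpm = (cpm_sample - cpm_blank) / counting_efficiency
    nmol = net_dpm / (specific_activity_uci_per_nmol * DPM_PER_MICROCURIE)
    return nmol * 1000.0  # nmol -> pmol
```

Because the specific activity is stated per nmol of tRNA, the result is expressed in tRNA equivalents; if only a portion of the combined filtrate is counted, the value would also need to be scaled by the fraction counted.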

In the alternative, pseudouridine synthase activity of NAAP is assayed at 30 °C to 37 °C in a
mixture containing 100 mM Tris-HCl (pH 8.0), 100 mM ammonium acetate, 5 mM MgCl₂, 2 mM
dithiothreitol, 0.1 mM EDTA, and 1-2 fmol of [³²P]-radiolabeled runoff transcripts (generated in vitro
by an appropriate RNA polymerase, i.e., T7 or SP6) as substrates. NAAP is added to initiate the
reaction or omitted from the reaction in control samples. Following incubation, the RNA is extracted
with phenol-chloroform, precipitated in ethanol, and hydrolyzed completely to 3'-nucleotide
monophosphates using RNase T2. The hydrolysates are analyzed by two-dimensional thin layer
chromatography, and the amount of ³²P radiolabel present in the ΨMP and UMP spots is evaluated
after exposing the thin layer chromatography plates to film or a PhosphorImager screen. Taking into
account the relative number of uridylate residues in the substrate RNA, the relative amounts of ΨMP and
UMP are determined and used to calculate the relative amount of Ψ per tRNA molecule (expressed in
mol Ψ/mol of tRNA or mol Ψ/mol of tRNA/minute), which corresponds to the amount of
pseudouridine synthase activity in the NAAP sample (Lecointe, supra).
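
The calculation of Ψ per RNA molecule from the two spot intensities can be sketched as follows. This is an illustrative calculation only, assuming the ΨMP and UMP spot signals have been quantified from the same plate and that the number of ³²P-labeled uridylate positions per substrate molecule is known. The function and parameter names are hypothetical.

```python
def psi_per_rna(psi_mp_signal, ump_signal, labeled_uridylates):
    """Estimate mol pseudouridine formed per mol of substrate RNA.

    The fraction of label found in the PsiMP spot is scaled by the number
    of labeled uridylate positions per RNA molecule.
    """
    total = psi_mp_signal + ump_signal
    if total <= 0:
        raise ValueError("combined spot signal must be positive")
    if labeled_uridylates <= 0:
        raise ValueError("labeled_uridylates must be positive")
    return labeled_uridylates * psi_mp_signal / total


def psi_formation_rate(mol_psi_per_mol_rna, incubation_minutes):
    """Express the result as mol Psi per mol RNA per minute."""
    if incubation_minutes <= 0:
        raise ValueError("incubation time must be positive")
    return mol_psi_per_mol_rna / incubation_minutes
```

For example, a substrate with four labeled uridylates in which one quarter of the label migrates as ΨMP corresponds to one mol Ψ per mol of RNA under these assumptions.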

N²,N²-dimethylguanosine transferase ((m²₂G)methyltransferase) activity of NAAP is
measured in a 160 μl reaction mixture containing 100 mM Tris-HCl (pH 7.5), 0.1 mM EDTA, 10 mM
MgCl₂, 20 mM NH₄Cl, 1 mM dithiothreitol, 6.2 μM S-adenosyl-L-[methyl-³H]methionine (30-70
Ci/mM), 8 μg m²₂G-deficient tRNA or wild type tRNA from yeast, and approximately 100 μg of
purified NAAP or a sample comprising NAAP. The reactions are incubated at 30 °C for 90 min and
chilled on ice. A portion of each reaction is diluted to 1 ml in water containing 100 μg BSA. 1 ml of 2
M HCl is added to each sample and the acid insoluble products are allowed to precipitate on ice for 20
min before being collected by filtration through glass fiber filters. The collected material is washed
several times with HCl and quantitated using a liquid scintillation counter. The amount of ³H
incorporated into the m²₂G-deficient, acid-insoluble tRNAs is proportional to the amount of
N²,N²-dimethylguanosine transferase activity in the NAAP sample. Reactions comprising no
substrate tRNAs, or wild-type tRNAs that have already been modified, serve as control reactions
which should not yield acid-insoluble ³H-labeled products.

Polyadenylation activity of NAAP is measured using an in vitro polyadenylation reaction.
The reaction mixture is assembled on ice and comprises 10 μl of 5 mM dithiothreitol, 0.025% (v/v)
NONIDET P-40, 50 mM creatine phosphate, 6.5% (w/v) polyvinyl alcohol, 0.5 unit/μl RNAGUARD
(Pharmacia), 0.025 μg/μl creatine kinase, 1.25 mM cordycepin 5'-triphosphate, and 3.75 mM MgCl₂, in
a total volume of 25 μl. 60 fmol of CstF, 50 fmol of CPSF, 240 fmol of PAP, 4 μl of crude or partially
purified CF II and various amounts of CF I are then added to the reaction mix. The volume
is adjusted to 23.5 μl with a buffer containing 50 mM Tris-HCl, pH 7.9, 10% (v/v) glycerol, and 0.1 mM
Na-EDTA. The final ammonium sulfate concentration should be below 20 mM. The reaction is
initiated (on ice) by the addition of 15 fmol of ³²P-labeled pre-mRNA template, along with 2.5 μg of
unlabeled tRNA, in 1.5 μl of water. Reactions are then incubated at 30 °C for 75-90 min and stopped
by the addition of 75 μl (approximately two volumes) of proteinase K mix (0.2 M Tris-HCl, pH 7.9,
300 mM NaCl, 25 mM Na-EDTA, 2% (w/v) SDS, 1 μl of 10 mg/ml proteinase K, 0.25 μl of 20 mg/ml
glycogen, and 23.75 μl of water). Following incubation, the RNA is precipitated with ethanol and
analyzed on a 6% (w/v) polyacrylamide, 8.3 M urea sequencing gel. The dried gel is developed by
autoradiography or using a phosphorimager. Cleavage activity is determined by comparing the amount
of cleavage product to the amount of pre-mRNA template. The omission of any of the polypeptide
components of the reaction and substitution of NAAP is useful for identifying the specific biological
function of NAAP in pre-mRNA polyadenylation (Ruegsegger, supra; and references within).
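
The comparison of cleavage product to pre-mRNA template can be expressed as a cleaved fraction. The sketch below is one possible way to do this, assuming band signals have been quantified from the gel. The optional correction for the share of label retained in the cleavage product is an added assumption, relevant only for internally labeled templates, and is not described in the protocol above. All names are hypothetical.

```python
def cleavage_fraction(product_signal, template_signal,
                      product_label_fraction=1.0):
    """Fraction of pre-mRNA template converted to cleavage product.

    product_label_fraction is the assumed share of the template's label
    that the cleavage product retains (1.0 means no correction).
    """
    if not 0 < product_label_fraction <= 1:
        raise ValueError("product_label_fraction must be in (0, 1]")
    if product_signal < 0 or template_signal < 0:
        raise ValueError("signals must be non-negative")
    # Scale the product band up to template-equivalent signal.
    corrected_product = product_signal / product_label_fraction
    total = corrected_product + template_signal
    if total == 0:
        raise ValueError("no signal detected in either band")
    return corrected_product / total
```

Comparing this fraction between a complete reaction and one in which a single factor is replaced by NAAP gives a simple numerical readout for the substitution experiments described above.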

tRNA synthetase activity is measured as the aminoacylation of a substrate tRNA in the
presence of [¹⁴C]-labeled amino acid. NAAP is incubated with [¹⁴C]-labeled amino acid and the
appropriate cognate tRNA (for example, [¹⁴C]alanine and tRNAAla) in a buffered solution.
¹⁴C-labeled product is separated from free [¹⁴C]amino acid by chromatography, and the incorporated ¹⁴C
is quantified by scintillation counter. The amount of ¹⁴C-labeled product detected is proportional to the
activity of NAAP in this assay.

In the alternative, NAAP activity is measured by incubating a sample containing NAAP in a
solution containing 1 mM ATP, 5 mM Hepes-KOH (pH 7.0), 2.5 mM KCl, 1.5 mM magnesium
chloride, and 0.5 mM DTT along with misacylated [¹⁴C]-Glu-tRNAGln (e.g., 1 μM) and a similar
concentration of unlabeled L-glutamine. Following the quenching of the reaction with 3 M sodium
acetate (pH 5.0), the mixture is extracted with an equal volume of water-saturated phenol, and the
aqueous and organic phases are separated by centrifugation at 15,000 x g at room temperature for 1
min. The aqueous phase is removed and precipitated with 3 volumes of ethanol at -70°C for 15 min.
The precipitated aminoacyl-tRNAs are recovered by centrifugation at 15,000 x g at 4°C for 15 min.
The pellet is resuspended in 25 mM KOH, deacylated at 65°C for 10 min, neutralized with 0.1 M
HCl (to final pH 6-7), and dried under vacuum. The dried pellet is resuspended in water and spotted
onto a cellulose TLC plate. The plate is developed in either isopropanol/formic acid/water or
ammonia/water/chloroform/methanol. The image is subjected to densitometric analysis and the
relative amounts of Glu and Gln are calculated based on the Rf values and relative intensities of the
spots. NAAP activity is calculated based on the amount of Gln resulting from the transformation of
Glu while acylated as Glu-tRNAGln (adapted from Curnow, A.W. et al. (1997) Proc. Natl. Acad. Sci.
USA 94:11819-26).

XIX. Identification of NAAP Agonists and Antagonists 
Agonists or antagonists of NAAP activation or inhibition may be tested using the assays
described in section XVIII. Agonists cause an increase in NAAP activity and antagonists cause a
decrease in NAAP activity.
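
The distinction between agonists and antagonists can be illustrated with a simple comparison of assay readouts. This is a hedged sketch only: the function name and the relative-change threshold are hypothetical choices introduced here to absorb assay noise, and no particular cutoff is prescribed by the description above.

```python
def classify_compound(activity_with, activity_without, min_relative_change=0.1):
    """Label a test compound from NAAP activity with and without it.

    min_relative_change is an assumed noise threshold: changes smaller
    than this fraction of the baseline are reported as 'no effect'.
    """
    if activity_without <= 0:
        raise ValueError("baseline activity must be positive")
    if min_relative_change < 0:
        raise ValueError("min_relative_change must be non-negative")
    relative_change = (activity_with - activity_without) / activity_without
    if relative_change > min_relative_change:
        return "agonist"
    if relative_change < -min_relative_change:
        return "antagonist"
    return "no effect"
```

Any of the activity measurements in section XVIII could supply the two readouts, provided both are taken under otherwise identical conditions.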

Various modifications and variations of the described compositions, methods, and systems of
the invention will be apparent to those skilled in the art without departing from the scope and spirit of
the invention. It will be appreciated that the invention provides novel and useful proteins, and their
encoding polynucleotides, which can be used in the drug discovery process, as well as methods for
using these compositions for the detection, diagnosis, and treatment of diseases and conditions.
Although the invention has been described in connection with certain embodiments, it should be
understood that the invention as claimed should not be unduly limited to such specific embodiments.
Nor should the description of such embodiments be considered exhaustive or limit the invention to the
precise forms disclosed. Furthermore, elements from one embodiment can be readily recombined with
elements from one or more other embodiments. Such combinations can form a number of
embodiments within the scope of the invention. It is intended that the scope of the invention be
defined by the following claims and their equivalents.

What is claimed is:

1. An isolated polypeptide selected from the group consisting of:

a) a polypeptide comprising an amino acid sequence selected from the group consisting of
SEQ ID NO:1-36,

b) a polypeptide comprising a naturally occurring amino acid sequence at least 90%
identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:1, SEQ ID NO:3-7, SEQ ID NO:10-21, SEQ ID NO:24-30, SEQ ID NO:32, SEQ
ID NO:34, and SEQ ID NO:36,

c) a polypeptide comprising a naturally occurring amino acid sequence at least 91%
identical to the amino acid sequence of SEQ ID NO:35,

d) a polypeptide comprising a naturally occurring amino acid sequence at least 95%
identical to the amino acid sequence of SEQ ID NO:9,

e) a polypeptide comprising a naturally occurring amino acid sequence at least 97%
identical to the amino acid sequence of SEQ ID NO:22,

f) a polypeptide comprising a naturally occurring amino acid sequence at least 98%
identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:2 and SEQ ID NO:33,

g) a polypeptide comprising a naturally occurring amino acid sequence at least 99%
identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:23 and SEQ ID NO:31,

h) a biologically active fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-36, and

i) an immunogenic fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-36.

2. An isolated polypeptide of claim 1 comprising an amino acid sequence selected from the
group consisting of SEQ ID NO:1-36.

3. An isolated polynucleotide encoding a polypeptide of claim 1. 

4. An isolated polynucleotide encoding a polypeptide of claim 2. 

5. An isolated polynucleotide of claim 4 comprising a polynucleotide sequence selected from
the group consisting of SEQ ID NO:37-72.

6. A recombinant polynucleotide comprising a promoter sequence operably linked to a 
polynucleotide of claim 3. 

7. A cell transformed with a recombinant polynucleotide of claim 6. 

8. A transgenic organism comprising a recombinant polynucleotide of claim 6. 

9. A method of producing a polypeptide of claim 1, the method comprising:

a) culturing a cell under conditions suitable for expression of the polypeptide, wherein
said cell is transformed with a recombinant polynucleotide, and said recombinant
polynucleotide comprises a promoter sequence operably linked to a polynucleotide
encoding the polypeptide of claim 1, and

b) recovering the polypeptide so expressed.

10. A method of claim 9, wherein the polypeptide comprises an amino acid sequence selected
from the group consisting of SEQ ID NO:1-36.

11. An isolated antibody which specifically binds to a polypeptide of claim 1.

12. An isolated polynucleotide selected from the group consisting of: 

a) a polynucleotide comprising a polynucleotide sequence selected from the group 
consisting of SEQ ID NO:37-72, 

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
90% identical to a polynucleotide sequence selected from the group consisting of SEQ
ID NO:37-43 and SEQ ID NO:46-71,

c) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 
92% identical to the polynucleotide sequence of SEQ ID NO:72, 

d) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 
95% identical to the polynucleotide sequence of SEQ ID NO:45, 

e) a polynucleotide complementary to a polynucleotide of a), 

f) a polynucleotide complementary to a polynucleotide of b), 

g) a polynucleotide complementary to a polynucleotide of c),

h) a polynucleotide complementary to a polynucleotide of d), and

i) an RNA equivalent of a)-h).

13. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a
polynucleotide of claim 12.

14. A method of detecting a target polynucleotide in a sample, said target polynucleotide
having a sequence of a polynucleotide of claim 12, the method comprising:

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the sample, and
which probe specifically hybridizes to said target polynucleotide, under conditions
whereby a hybridization complex is formed between said probe and said target
polynucleotide or fragments thereof, and

b) detecting the presence or absence of said hybridization complex, and, optionally, if
present, the amount thereof.

15. A method of claim 14, wherein the probe comprises at least 60 contiguous nucleotides. 

16. A method of detecting a target polynucleotide in a sample, said target polynucleotide
having a sequence of a polynucleotide of claim 12, the method comprising:

a) amplifying said target polynucleotide or fragment thereof using polymerase chain
reaction amplification, and

b) detecting the presence or absence of said amplified target polynucleotide or fragment
thereof, and, optionally, if present, the amount thereof.

17. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable
excipient.

18. A composition of claim 17, wherein the polypeptide comprises an amino acid sequence
selected from the group consisting of SEQ ID NO:1-36.

19. A method for treating a disease or condition associated with decreased expression of
functional NAAP, comprising administering to a patient in need of such treatment the composition of
claim 17.

20. A method of screening a compound for effectiveness as an agonist of a polypeptide of
claim 1, the method comprising:

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and 

b) detecting agonist activity in the sample. 

21. A composition comprising an agonist compound identified by a method of claim 20 and a
pharmaceutically acceptable excipient.

22. A method for treating a disease or condition associated with decreased expression of
functional NAAP, comprising administering to a patient in need of such treatment a composition of
claim 21.

23. A method of screening a compound for effectiveness as an antagonist of a polypeptide of
claim 1, the method comprising:

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and

b) detecting antagonist activity in the sample.

24. A composition comprising an antagonist compound identified by a method of claim 23 and
a pharmaceutically acceptable excipient.

25. A method for treating a disease or condition associated with overexpression of functional
NAAP, comprising administering to a patient in need of such treatment a composition of claim 24.

26. A method of screening for a compound that specifically binds to the polypeptide of claim
1, the method comprising:

a) combining the polypeptide of claim 1 with at least one test compound under suitable
conditions, and

b) detecting binding of the polypeptide of claim 1 to the test compound, thereby
identifying a compound that specifically binds to the polypeptide of claim 1.

27. A method of screening for a compound that modulates the activity of the polypeptide of
claim 1, the method comprising:

a) combining the polypeptide of claim 1 with at least one test compound under conditions
permissive for the activity of the polypeptide of claim 1,

b) assessing the activity of the polypeptide of claim 1 in the presence of the test
compound, and

c) comparing the activity of the polypeptide of claim 1 in the presence of the test
compound with the activity of the polypeptide of claim 1 in the absence of the test
compound, wherein a change in the activity of the polypeptide of claim 1 in the
presence of the test compound is indicative of a compound that modulates the activity
of the polypeptide of claim 1.

28. A method of screening a compound for effectiveness in altering expression of a target
polynucleotide, wherein said target polynucleotide comprises a sequence of claim 5, the method
comprising:

a) exposing a sample comprising the target polynucleotide to a compound, under
conditions suitable for the expression of the target polynucleotide,

b) detecting altered expression of the target polynucleotide, and

c) comparing the expression of the target polynucleotide in the presence of varying
amounts of the compound and in the absence of the compound.

29. A method of assessing toxicity of a test compound, the method comprising:

a) treating a biological sample containing nucleic acids with the test compound,

b) hybridizing the nucleic acids of the treated biological sample with a probe comprising
at least 20 contiguous nucleotides of a polynucleotide of claim 12 under conditions
whereby a specific hybridization complex is formed between said probe and a target
polynucleotide in the biological sample, said target polynucleotide comprising a
polynucleotide sequence of a polynucleotide of claim 12 or fragment thereof,

c) quantifying the amount of hybridization complex, and

d) comparing the amount of hybridization complex in the treated biological sample with
the amount of hybridization complex in an untreated biological sample, wherein a
difference in the amount of hybridization complex in the treated biological sample is
indicative of toxicity of the test compound.

30. A diagnostic test for a condition or disease associated with the expression of NAAP in a 
biological sample, the method comprising: 

a) combining the biological sample with an antibody of claim 11, under conditions
suitable for the antibody to bind the polypeptide and form an antibody:polypeptide
complex, and

b) detecting the complex, wherein the presence of the complex correlates with the
presence of the polypeptide in the biological sample.

31. The antibody of claim 11, wherein the antibody is:

a) a chimeric antibody,

b) a single chain antibody,

c) a Fab fragment,

d) a F(ab')2 fragment, or

e) a humanized antibody.

32. A composition comprising an antibody of claim 11 and an acceptable excipient.

33. A method of diagnosing a condition or disease associated with the expression of NAAP in
a subject, comprising administering to said subject an effective amount of the composition of claim 32.

34. A composition of claim 32, wherein the antibody is labeled.

35. A method of diagnosing a condition or disease associated with the expression of NAAP in
a subject, comprising administering to said subject an effective amount of the composition of claim 34.

36. A method of preparing a polyclonal antibody with the specificity of the antibody of claim
11, the method comprising:

a) immunizing an animal with a polypeptide consisting of an amino acid sequence selected
from the group consisting of SEQ ID NO:1-36, or an immunogenic fragment thereof,
under conditions to elicit an antibody response,

b) isolating antibodies from said animal, and

c) screening the isolated antibodies with the polypeptide, thereby identifying a polyclonal
antibody which specifically binds to a polypeptide comprising an amino acid sequence
selected from the group consisting of SEQ ID NO:1-36.

37. A polyclonal antibody produced by a method of claim 36. 

38. A composition comprising the polyclonal antibody of claim 37 and a suitable carrier.

39. A method of making a monoclonal antibody with the specificity of the antibody of claim
11, the method comprising:

a) immunizing an animal with a polypeptide consisting of an amino acid sequence selected
from the group consisting of SEQ ID NO:1-36, or an immunogenic fragment thereof,
under conditions to elicit an antibody response,

b) isolating antibody producing cells from the animal,

c) fusing the antibody producing cells with immortalized cells to form monoclonal
antibody-producing hybridoma cells,

d) culturing the hybridoma cells, and

e) isolating from the culture monoclonal antibody which specifically binds to a
polypeptide comprising an amino acid sequence selected from the group consisting of
SEQ ID NO:1-36.

40. A monoclonal antibody produced by a method of claim 39. 

41. A composition comprising the monoclonal antibody of claim 40 and a suitable carrier.

42. The antibody of claim 11, wherein the antibody is produced by screening a Fab expression
library.

43. The antibody of claim 11, wherein the antibody is produced by screening a recombinant
immunoglobulin library.

44. A method of detecting a polypeptide comprising an amino acid sequence selected from the
group consisting of SEQ ID NO:1-36 in a sample, the method comprising:

a) incubating the antibody of claim 11 with a sample under conditions to allow specific
binding of the antibody and the polypeptide, and

b) detecting specific binding, wherein specific binding indicates the presence of a
polypeptide comprising an amino acid sequence selected from the group consisting of
SEQ ID NO:1-36 in the sample.



45. A method of purifying a polypeptide comprising an amino acid sequence selected from the
group consisting of SEQ ID NO:1-36 from a sample, the method comprising:

a) incubating the antibody of claim 11 with a sample under conditions to allow specific
binding of the antibody and the polypeptide, and

b) separating the antibody from the sample and obtaining the purified polypeptide
comprising an amino acid sequence selected from the group consisting of SEQ ID
NO:1-36.

46. A microarray wherein at least one element of the microarray is a polynucleotide of claim
13.

47. A method of generating an expression profile of a sample which contains polynucleotides, 
the method comprising: 

a) labeling the polynucleotides of the sample, 

b) contacting the elements of the microarray of claim 46 with the labeled polynucleotides 
of the sample under conditions suitable for the formation of a hybridization complex, 
and 

c) quantifying the expression of the polynucleotides in the sample. 

48. An array comprising different nucleotide molecules affixed in distinct physical locations on 
a solid substrate, wherein at least one of said nucleotide molecules comprises a first oligonucleotide or 
polynucleotide sequence specifically hybridizable with at least 30 contiguous nucleotides of a target 
polynucleotide, and wherein said target polynucleotide is a polynucleotide of claim 12. 

49. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is 
completely complementary to at least 30 contiguous nucleotides of said target polynucleotide. 

50. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is 
completely complementary to at least 60 contiguous nucleotides of said target polynucleotide. 

51. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to said target polynucleotide. 

52. An array of claim 48, which is a microarray. 

53. An array of claim 48, further comprising said target polynucleotide hybridized to a
nucleotide molecule comprising said first oligonucleotide or polynucleotide sequence.

54. An array of claim 48, wherein a linker joins at least one of said nucleotide molecules to
said solid substrate.

55. An array of claim 48, wherein each distinct physical location on the substrate contains
multiple nucleotide molecules, and the multiple nucleotide molecules at any single distinct physical
location have the same sequence, and each distinct physical location on the substrate contains nucleotide
molecules having a sequence which differs from the sequence of nucleotide molecules at another distinct
physical location on the substrate.

10 56. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:l. 

57. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:2. 

58. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:3. 

15 

59. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:4. 

60. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:5. 
20 61 . A polypeptide of claim 1 , comprising the amino acid sequence of SEQ ID NO:6. 

62. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:7. 

63. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:8. 

25 

64. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:9. 

65. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 10. 
30 66. A polypeptide of claim 1 , comprising the amino acid sequence of SEQ ID NO: 1 1 . 

67. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:12. 

68. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO: 13. 
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69. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ED NO:14. 

70. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:15. 

71. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:16. 

72. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:17. 

73. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ED NO: 18. 

74. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ED NO: 19. 

75. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:20. 
15 76. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ED NO:21. 

77. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ED NO:22. 

78. A polypeptide of claim 1 , comprising the amino acid sequence of SEQ ID NO:23. 

20 

79. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:24. 

80. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:25. 
25 81. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:26. 

82. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ED NO:27. 

83. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ED NO:28. 

30 

84. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:29. 

85. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ED NO:30. 
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86. A polypeptide of claim 1 , comprising the amino acid sequence of SEQ ID NO:3 1 . 

87. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:32. 

88. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:33. 

89. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:34. 

90. A polypeptide of claim 1 , comprising the amino acid seqa&nc& of SEQ ID NO:35. 

91. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:36. 

92. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:37. 

93. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:38. 

94. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:39. 

95. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ED NO:40. 

96. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:41. 

97. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:42. 

98. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:43. 

99. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:44. 

100. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:45. 

101. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:46. 

102. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:47.

103. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:48.

104. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:49. 

105. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:50. 

106. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:51. 

107. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:52. 

108. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:53. 

109. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:54. 

110. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:55.

111. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:56.

112. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:57.

113. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:58.

114. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:59.

115. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:60.

116. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:61.

117. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:62.

118. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:63.

119. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:64.

120. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:65.

121. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:66.

122. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:67. 

123. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:68.

<150> US 60/303,442
<151> 2001-07-06 

<150> US 60/364,438 
<151> 2002-03-12 

<160> 72 

<170> PERL Program 

<210> 1 

<211> 304 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7490148CD1 

<400> 1 



Met Ser Arg Ser Phe Tyr Val Asp Ser Leu Ile Ile Lys Asp Thr
1 5 10 15

Ser Arg Pro Ala Pro Ser Leu Pro Glu Pro His Pro Gly Pro Asp
20 25 30

Phe Phe Ile Pro Leu Gly Met Pro Pro Pro Leu Val Met Ser Val
35 40 45

Ser Gly Pro Gly Cys Pro Ser Arg Lys Ser Gly Ala Phe Cys Val
50 55 60

Cys Pro Leu Cys Val Thr Ser His Leu His Ser Ser Arg Gly Ser
65 70 75

Val Gly Pro Ala Ser Gly Gly Ala Gly Pro Gly Phe Pro Gly Pro
80 85 90

Gly Asp Ser Gly Val Ala Gly Pro Ala Gly Ala Leu Pro Leu Leu
95 100 105

Lys Gly Gln Phe Ser Ser Ala Pro Gly Asp Ala Gln Phe Cys Pro
110 115 120

Arg Val Asn His Ala His His His His His Pro Pro Gln His His
125 130 135

His His His His Gln Pro Gln Gln Pro Gly Ser Ala Ala Ala Ala
140 145 150

Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Leu Gly His
155 160 165

Pro Gln His His Ala Pro Val Cys Thr Ala Thr Thr Tyr Asn Val
170 175 180

Ala Asp Pro Arg Arg Phe His Cys Leu Thr Met Gly Gly Ser Asp
185 190 195

Ala Ser Gln Val Pro Asn Gly Lys Arg Met Arg Thr Ala Phe Thr
200 205 210

Ser Thr Gln Leu Leu Glu Leu Glu Arg Glu Phe Ser Ser Asn Met
215 220 225

Tyr Leu Ser Arg Leu Arg Arg Ile Glu Ile Ala Thr Tyr Leu Asn
230 235 240

Leu Ser Glu Lys Gln Val Lys Ile Trp Phe Gln Asn Arg Arg Val
245 250 255

Lys His Lys Lys Glu Gly Lys Gly Thr Gln Arg Asn Ser His Ala
260 265 270

Gly Cys Lys Cys Val Gly Ser Gln Val His Tyr Ala Arg Ser Glu
275 280 285

Asp Glu Asp Ser Leu Ser Pro Ala Ser Ala Asn Asp Asp Lys Glu
290 295 300

Ile Ser Pro Leu



<210> 2 

<211> 198 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7490301CD1 

<400> 2 



Met Glu Thr Gly Arg Gln Ala Gly Val Ser Ala Glu Met Phe Ala
1 5 10 15

Met Pro Arg Asp Leu Lys Gly Ser Asn Lys Asp Gly Ile Pro Glu
20 25 30

Asp Leu Asp Gly Asn Leu Glu Glu Pro Arg Asp Gln Glu Gly Glu
35 40 45

Leu Arg Ser Glu Asp Val Met Asp Leu Thr Glu Gly Asp Asn Glu
50 55 60

Ala Ser Ala Ser Ala Pro Pro Ala Ala Lys Arg Arg Lys Thr Asp
65 70 75

Thr Lys Gly Lys Lys Glu Arg Lys Pro Thr Val Asp Ala Glu Glu
80 85 90

Ala Gln Arg Met Thr Thr Leu Leu Ser Ala Met Ser Glu Glu Gln
95 100 105

Leu Ser Arg Tyr Glu Val Cys Arg Arg Ser Ala Phe Pro Lys Ala
110 115 120

Cys Ile Ala Gly Leu Met Arg Ser Ile Thr Gly Arg Ser Val Ser
125 130 135

Glu Asn Val Ala Ile Ala Met Ala Gly Ile Ala Lys Val Phe Val
140 145 150

Gly Glu Val Val Glu Glu Ala Leu Asp Val Cys Glu Met Trp Gly
155 160 165

Glu Met Pro Pro Leu Gln Pro Lys His Leu Arg Glu Ala Val Arg
170 175 180

Arg Leu Lys Pro Lys Gly Leu Phe Pro Asn Ser Asn Tyr Lys Lys
185 190 195

Ile Met Phe



























<210> 3 

<211> 576 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 2383223CD1 

<400> 3 

Met Asp Ser Val Ala Phe Glu Asp Val Ser Val Ser Phe Ser Gln
1 5 10 15


Glu 


Glu Trp Ala 


Leu Leu Ala Pro Ser 


W7-Ln 


Lys 


Lys 












A D 








30 


Asp 


Val Met Gin 


Glu Thr Phe Lys Asn 


T All 


Al » 


ser 




Glu 






o o 










45 


Lys 


Trp Glu Asp 


Pro Asn Val Glu Asp 


f2l r-t 


fll S 


Lys 


noli UXll 


Gly 
















60 


Arg 


Asn Leu Arg 


Ser His Thr Gly Glu 


Arg 


Leu 


Cys 


wXU VJljf 


Lys 






c c 

O D 


•7 n 
/ u 








75 


Glu 


Gly Ser Gin 


Cys Ala Glu Asn Pne 


Ser 


Pro 


Asn 


Leu Ser 


Val 






80 










Of) 


Tlir 


Lys Lys Thr 


Ala Gly Val Lys Pro 


Tyr 


vjIU 


Cys 


rpV,-^. Tip 

jl rijr lie 


\-» _y & 








*i n n 

1UU 








1U J 


Gly 


Lys Ala Phe 


Met Arg Leu Ser Ser 


Leu 


rpl- -»- 

i nr 


Arg 


T4-> a Mot* 


Arg 






110 


lib 








1^ V 


Ser 


His Thr Gly 


Tyr Glu Leu Phe Glu 


Lys 


Pro 


Tyr 




Lys 






125 


liU 








1 -3C 
X S> 3 


Glu 


Cys Glu Lys 


Ala Phe Ser Tyr Leu 


Lys 




irne 


Gin Arg 


nib 






140 


143 








1 

XJU 


Glu 


Arg Ser His 


Thr Gly Glu Lys Pro 


Tyr 


Lys 


Cys 


T ^.TO F * | 1»-| 

jjys bin 


Cys 






155 


lb U 








IOj 


Gly 


Lys Thr Phe 


He Tyr His Gin Pro 


r*ne 


i7in 


Arg 










170 


1 / o 








lOU 


Thr 


His lie Gly 


Glu Lys Pro Tyr Glu 


Cys 


Lys 


bin 


fH^e* r"2l -i" 

Ljys oiy 


Lys 






185 


ion 
iy u 








J. y —> 


Ala 


Leu Ser Cys 


Ser Ser Ser Leu Arg 


vai 


111 s 


rjl 11 


- 

Arg lie 








200 












Thr Gly Glu Lys 


Pro Tyr Glu Cys Lys 


Gin 


Cys 


Lrly 


Lys Ala 








215 


o o n 










Ser 


Cys Ser Ser 


Ser lie Arg val ills 


r»l ii 
\j 1U 


Arg 


J. nr 


XiJ-O X IJ.J- 


Gly 






230 


O "2 c 








A *± L/ 


Glu 


Lys Pro Tyr 


Ala Cys Lys Glu Cys 


Gly 


Lys 


A 1 » 

Ala 


irne lie 


Ser 






245 


ZOU 








Z J J 


His 


Thr Ser Val 


Leu Tnr His Met lie 


mr 


riis 


Asn 




Arg 






260 


ZOO 










Pro 


Tyr Lys Cys 


Lys Glu Cys Gly Lys 


Ala 


rne 


Tl- 
119 


TJh ^ T3 "v— 

irne Jrro 


Ser 






275 


2 80 








ZOD 


Phe 


Leu Arg Val 


His Glu Arg He His 


Thr 


r*»l * t 

Gly 


blu 


Lys Pro 


Tyr 






290 


295 










Lys 


Cys Lys Gin 


Cys Gly Lys Ala Phe 


Arg 


Cys 


Ser 


V* C? 

inr ser 


lie 






305 


310 








jIj 


Gin 


He His Glu 


Arg He His Thr Gly Glu 


Lys 


Pro 


±y L. Lxy 








320 


325 








O J w 


Lys 


Glu Cys Gly 


Lys Ser Phe Ser Ala 


Arg 


Pro 


Ala 


irne Airy 


Val 






335 


340 








J4 J 


His 


Val Arg Val 


His Thr Gly Glu Lys 


Pro 


Tyr Lys 




nl it 






350 


355 








JUU 


Cys 


Gly Lys Ala 


Phe Ser Arg He Ser 


Tyr 


Phe 


Arg 


lie nis 


rtl 11 






365 


370 








_> / 3 


Arg 


Thr His Thr 


Gly Glu Lys Pro Tyr 


Glu 


Cys 


Lys 




vjxy 






380 


385 








390 


Lys Thr Phe Asn Tyr Pro Leu Asp Leu Lys Ile His Lys Arg Asn
395 400 405

His Thr Gly Glu Lys Pro Tyr Glu Cys Lys Glu Cys Ala Lys Thr
410 415 420

Phe Ile Ser Leu Glu Asn Phe Arg Arg His Met Ile Thr His Thr
425 430 435




Asp 


Gly 


Pro 


Tyr 


Lys 


Cys Arg Asp Cys Gly Lys Val Phe 


He 










a a n 




445 


450 


Jrrie 


PlTO 


Ser 


Ala 


Leu 


Arg 


Thr His Glu Arg Thr His Thr Gly Glu 














O \J 


465 


Lys 


Pro 


Tyr 


(jrXU 


Cys 


Lys 


bin \jys uiy Jjys riXd xriies >->ci. ^-jr ° 


Ser 










4 / U 




f± / -J 


480 


Ser 


Tyx 


Tl — . 

116 


Arg 


He 


His 


L»y s Arg inr nis inr «iy j_»y& 


Pro 










485 




^ v> 


495 


Tyx 


Glu 


Cys 


uys 


Glu 


Cys 


ralxr T.-vrci Ala pV»<= Tie Tvr Pro Thr 


Ser 










500 








Phe 


Gin 


Gly 


TT -I _ 

HXS 


Met 


Arg 


Met His Thr Gly Glu Lys Pro Tyr 


Lys 










515 




520 


525 


cys 


jjys 




Cys 


Gly Lys 


Ala Phe Ser Leu His Ser Ser Phe 


Gin 










530 




535 


540 


Arg His Thr Arg Ile His Asn Tyr Glu Lys Pro Leu Glu Cys Lys
545 550 555

Gln Cys Gly Lys Ala Phe Ser Val Ser Thr Ser Leu Lys Lys His
560 565 570

Met Arg Met His Asn Arg
575









<210> 4 
<211> 426 
<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 3495982CD1 



<400> 4 




























Met Arg Arg Asn Ser Ser Leu Ser Phe Gln Met Glu Arg Pro Leu
1 5 10 15

Glu Glu Gln Val Gln Ser Lys Trp Ser Ser Ser Gln Gly Arg Thr
20 25 30

Gly Thr Gly Gly Ser Asp Val Leu Gln Met Gln Asn Ser Glu His
35 40 45

His Gly Gln Ser Ile Lys Thr Gln Thr Asp Ser Ile Ser Leu Glu
50 55 60

Asp Val Ala Val Asn Phe Thr Leu Glu Glu Trp Ala Leu Leu Asp
65 70 75

Pro Gly Gln Arg Asn Ile Tyr Arg Asp Val Met Arg Ala Thr Phe
80 85 90

Lys Asn Leu Ala Cys Ile Gly Glu Lys Trp Lys Asp Gln Asp Ile
95 100 105

Glu Asp Glu His Lys Asn Gln Gly Arg Asn Leu Arg Ser Pro Met
110 115 120

Val Glu Ala Leu Cys Glu Asn Lys Glu Asp Cys Pro Cys Gly Lys
125 130 135

Ser Thr Ser Gln Ile Pro Asp Leu Asn Thr Asn Leu Glu Thr Pro
140 145 150

Thr Gly Leu Lys Pro Cys Asp Cys Ser Val Cys Gly Glu Val Phe
155 160 165

Met His Gln Val Ser Leu Asn Arg His Met Arg Ser His Thr Glu
170 175 180

Gln Lys Pro Asn Glu Cys His Glu Tyr Gly Glu Lys Pro His Lys
185 190 195


Cys Lys 


pi 


Cys Gly Lys 


Thr Phe 


Thr 


Arg 


Ser 


Ser 


Ser 


He 


Arg 






200 








205 










210 


Thr His 


L?XU 


Arg He 


His 


Thr Gly Glu 


Lys 


Pro 


Tvr 


Glu 


Cvs 


Lys 






215 








220 










225 


Glu Cys 


Gly 


Lys Ala 


Phe 


jvl a r>Vi a 

A±ci irxie 


Leu 


IT A ICS 


Ser 


Phe 


Arg 


Asn 


His 






230 








235 










240 


lie Arg 


He 


His Thr Gly 


vjiu inr 


Pro 


j.yjr 


Glu 


Cvs 


Lvs 


Glu 


Cys 






245 








£t J \J 










255 


Gly Lys 


Ala 


Phe Arg 


Tyr 


Leu Thr 


Ala 


Leu 


Arg 


riJ - y 


His 


Glu 


Lvs 




260 








265 










270 


Asn His 


Thr Gly Glu 


Lys 


Pro Tyr 


Lys 


Cys 


Lys 


Gin 


Cys 


Glv 


Lvs 






275 








280 










285 


Ala Phe 


He 


Tyr Tyr 


/si „ 

Gin 


irro rile 


Leu 


Thr 


His 


Glu 


Arg 


Thr 


His 






290 








295 










300 


Thr Gly Glu Lys Pro Tyr Glu Cys Lys Gln Cys Gly Lys Ala Phe
305 310 315

Ser Cys Pro Thr Tyr Leu Arg Ser His Glu Lys Thr His Thr Gly
320 325 330

Glu Lys Pro Phe Val Cys Arg Glu Cys Gly Arg Ala Phe Phe Ser
335 340 345

His Ser Ser Leu Arg Lys His Val Ser His His Thr Arg Pro Pro
350 355 360

Val Leu Phe Phe Phe Phe Glu Thr Glu Ser Leu Pro Arg Leu Glu
365 370 375

Cys Ser Gly Ala Ile Ser Ala Tyr Cys Lys Leu Arg Leu Leu Gly
380 385 390

Ser Arg His Ser Pro Ala Ser Ala Ser Arg Val Ala Gly Thr Thr
395 400 405

Gly Ala Arg His His Ala Arg Leu Ile Phe Cys Ile Phe Ser Gly
410 415 420

Asp Gly Val Ser Pro Cys
425





















<210> 5 

<211> 786 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7477891CD1 

<400> 5 

Met Ala Asn Asn Tyr Lys Lys Ile Val Leu Leu Lys Gly Leu Glu
1 5 10 15

Val Ile Asn Asp Tyr His Phe Arg Ile Val Lys Ser Leu Leu Ser
20 25 30

Asn Asp Leu Lys Leu Asn Pro Lys Met Lys Glu Glu Tyr Asp Lys
35 40 45

Ile Gln Ile Ala Asp Leu Met Glu Glu Lys Phe Pro Gly Asp Ala
50 55 60

Gly Leu Gly Lys Leu Ile Glu Phe Phe Lys Glu Ile Pro Thr Leu
65 70 75

Gly Asp Leu Ala Glu Thr Leu Lys Arg Glu Lys Leu Lys Val Lys
80 85 90

Gly Ile Ile Pro Ser Lys Lys Thr Lys Gln Lys Glu Val Tyr Pro
95 100 105

Ala Thr Pro Ala Cys Thr Pro Ser Asn Arg Leu Thr Ala Lys Gly
110 115 120

Ala Glu Glu Thr Leu Gly Pro Gln Lys Arg Lys Lys Pro Ser Glu
125 130 135

Glu Glu Thr Gly Thr Lys Arg Ser Lys Met Ser Lys Glu Gln Thr
140 145 150

Arg Pro Ser Cys Ser Ala Gly Ala Ser Thr Ser Thr Ala Met Gly
155 160 165

Arg Ser Pro Pro Pro Gln Thr Ser Ser Ser Ala Pro Pro Asn Thr
170 175 180

Ser Ser Thr Glu Ser Leu Lys Pro Leu Ala Asn Arg His Ala Thr
185 190 195

Ala Ser Lys Asn Ile Phe Arg Glu Asp Pro Ile Ile Ala Met Val
200 205 210

Leu Asn Ala Thr Lys Val Phe Lys Tyr Glu Ser Ser Glu Asn Glu
215 220 225

Gln Arg Arg Met Phe His Ala Thr Val Ala Thr Gln Thr Gln Phe
230 235 240

Phe His Val Lys Val Leu Asn Ile Asn Leu Lys Arg Lys Phe Ile
245 250 255

Lys Lys Arg Ile Ile Ile Ile Ser Asn Tyr Ser Lys Arg Asn Ser
260 265 270

Leu Leu Glu Val Asn Glu Ala Ser Ser Val Ser Glu Ala Gly Pro
275 280 285


As]^ 


Gin 


Thr 


Phe 


Glu 

a. y vj 


Val 


Pro 


Lys 


Asp 


He 

£a Z? -J 


He 


Arg 


Arg Ala 


Lys 
300 


Lys Ile Pro Lys Ile Asn Ile Leu His Lys Gln Thr Ser Gly Tyr
305 310 315

Ile Val Tyr Gly Leu Phe Met Leu His Thr Lys Ile Val Asn Arg
320 325 330

Lys Thr Thr Ile Tyr Glu Ile Gln Asp Lys Thr Gly Ser Met Ala
335 340 345

Val Val Gly Lys Gly Glu Cys His Asn Ile Pro Cys Glu Lys Gly
350 355 360

Asp Lys Leu Arg Leu Phe Cys Phe Arg Leu Arg Lys Arg Glu Asn
365 370 375

Met Ser Lys Leu Met Ser Glu Met His Ser Phe Ile Gln Ile Gln
380 385 390


Lys 


Asn 


Thr 


Asn 


-D J —) 


Arg 


Ser 


His 




Ser 
400 


Arg 


Ser 


Met Ala 


Leu 
405 


Pro Gln Glu Gln Ser Gln His Pro Lys Pro Ser Glu Ala Ser Thr
410 415 420

Thr Leu Pro Glu Ser His Leu Lys Thr Pro Gln Met Pro Pro Thr
425 430 435

Thr Pro Ser Ser Ser Phe Phe Thr Lys Lys Ser Glu Asp Thr Ile
440 445 450




J_)_y & 


Met 


Asn 


Asp 
455 


Phe 


Met 


Arg 


Met 


Gin 
460 


lie 


Leu 


Lys Glu 


Gly 
465 


Ser His Phe Pro Gly Pro Phe Met Thr Ser Ile Gly Pro Ala Glu
470 475 480

Ser His Pro His Thr Pro Gln Met Pro Pro Ser Thr Pro Ser Ser
485 490 495

Ser Phe Leu Thr Thr Lys Ser Glu Asp Thr Ile Ser Lys Met Asn
500 505 510

Asp Phe Met Arg Met Gln Ile Leu Lys Glu Gly Ser His Phe Pro
515 520 525

Gly Pro Phe Met Thr Ser Ile Gly Pro Ala Glu Ser His Pro His
530 535 540

Thr Pro Gln Met Pro Pro Ser Thr Pro Ser Ser Ser Phe Leu Thr
545 550 555

Thr Leu Lys Pro Arg Leu Lys Thr Glu Pro Glu Glu Val Ser Ile
560 565 570


Glu 


Ast) 


Ser 


Ala 


Gin 
575 


Ser 


Asp Leu 


Lys 


Glu 
580 


Val 


Met 


Val 


Leu 


Asn 
585 


Ala Thr Glu Ser Phe Val Tyr Glu Pro Lys Glu Gln Lys Lys Met
590 595 600

Phe His Ala Thr Val Ala Thr Glu Asn Glu Val Phe Arg Val Lys
605 610 615

Val Phe Asn Ile Asp Leu Lys Glu Lys Phe Thr Pro Lys Lys Ile
620 625 630

Ile Ala Ile Ala Asn Tyr Val Cys Arg Asn Gly Phe Leu Glu Val
635 640 645

Tyr Pro Phe Thr Leu Val Ala Asp Val Asn Ala Asp Arg Asn Met
650 655 660

Glu Ile Pro Lys Gly Leu Ile Arg Ser Ala Ser Val Thr Pro Lys
665 670 675

Ile Asn Gln Leu Cys Ser Gln Thr Lys Gly Ser Phe Val Asn Gly
680 685 690

Val Phe Glu Val His Lys Lys Asn Val Arg Gly Glu Phe Thr Tyr
695 700 705

Tyr Glu Ile Gln Asp Asn Thr Gly Lys Met Glu Val Val Val His
710 715 720

Gly Arg Leu Thr Thr Ile Asn Cys Glu Glu Gly Asp Lys Leu Lys
725 730 735

Leu Thr Cys Phe Glu Leu Ala Pro Lys Ser Gly Asn Thr Gly Glu
740 745 750

Leu Arg Ser Val Ile His Ser His Ile Lys Val Ile Lys Thr Arg
755 760 765

Lys Asn Lys Lys Asp Ile Leu Asn Pro Asp Ser Ser Met Glu Thr
770 775 780

Ser Pro Asp Phe Phe Phe
785



















<210> 6 

<211> 617 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 72688352CD1 

<400> 6 

Met Ile Lys Ser Gln Glu Ser Leu Thr Leu Glu Asp Val Ala Val
1 5 10 15

Glu Phe Thr Trp Glu Glu Trp Gln Leu Leu Gly Pro Ala Gln Lys
20 25 30

Asp Leu Tyr Arg Asp Val Met Leu Glu Asn Tyr Ser Asn Leu Val
35 40 45

Ser Val Gly Tyr Gln Ala Ser Lys Pro Asp Ala Leu Phe Lys Leu
50 55 60

Glu Gln Gly Glu Pro Trp Thr Val Glu Asn Glu Ile His Ser Gln
65 70 75

Ile Cys Pro Glu Ile Lys Lys Val Asp Asn His Leu Gln Met His
80 85 90

Ser Gln Lys Gln Arg Cys Leu Lys Arg Val Glu Gln Cys His Lys
95 100 105

His Asn Ala Phe Gly Asn Ile Ile His Gln Arg Lys Ser Asp Phe
110 115 120

Pro Leu Arg Gln Asn His Asp Thr Phe Asp Leu His Gly Lys Ile
125 130 135

Leu Lys Ser Asn Leu Ser Leu Val Asn Gln Asn Lys Arg Tyr Glu
140 145 150

Ile Lys Asn Ser Val Gly Val Asn Gly Asp Gly Lys Ser Phe Leu
155 160 165

His Ala Lys His Glu Gln Phe His Asn Glu Met Asn Phe Pro Glu
170 175 180

Gly Gly Asn Ser Val Asn Thr Asn Ser Gln Phe Ile Lys His Gln
185 190 195

Arg Thr Gln Asn Ile Asp Lys Pro His Val Cys Thr Glu Cys Gly
200 205 210

Lys Ala Phe Leu Lys Lys Ser Arg Leu Ile Tyr His Gln Arg Val
215 220 225

His Thr Gly Glu Lys Pro His Gly Cys Ser Ile Cys Gly Lys Ala
230 235 240

Phe Ser Arg Lys Ser Gly Leu Thr Glu His Gln Arg Asn His Thr
245 250 255

Gly Glu Lys Pro Tyr Glu Cys Thr Glu Cys Asp Lys Ala Phe Arg
260 265 270

Trp Lys Ser Gln Leu Asn Ala His Gln Lys Ile His Thr Gly Glu
275 280 285

Lys Ser Tyr Ile Cys Ser Asp Cys Gly Lys Gly Phe Ile Lys Lys
290 295 300

Ser Arg Leu Ile Asn His Gln Arg Val His Thr Gly Glu Lys Pro
305 310 315

His Gly Cys Ser Leu Cys Gly Lys Ala Phe Ser Lys Arg Ser Arg
320 325 330

Leu Thr Glu His Gln Arg Thr His Thr Gly Glu Lys Pro Tyr Glu
335 340 345

Cys Thr Glu Cys Asp Lys Ala Phe Arg Trp Lys Ser Gln Leu Asn
350 355 360

Ala His Gln Lys Ala His Thr Gly Glu Lys Ser Tyr Ile Cys Arg
365 370 375

Asp Cys Gly Lys Gly Phe Ile Gln Lys Gly Asn Leu Ile Val His
380 385 390

Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Ile Cys Asn Glu Cys
395 400 405

Gly Lys Gly Phe Ile Gln Lys Gly Asn Leu Leu Ile His Arg Arg
410 415 420

Thr His Thr Gly Glu Lys Pro Tyr Val Cys Asn Glu Cys Gly Lys
425 430 435

Gly Phe Ser Gln Lys Thr Cys Leu Ile Ser His Gln Arg Phe His
440 445 450

Thr Gly Lys Thr Pro Phe Val Cys Thr Glu Cys Gly Lys Ser Cys
455 460 465
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O CI 


His 


Lys 


Ser 


Gly Leu 


lie Asn 


His Gin Arg 


lie His 


Thr 


Gly 










470 






475 






480 




Lys 


Pro 




XilX 


CyS 


Ser Asp Cys Gly Lys 


Ala Phe 


Ara 


Asp 










485 






490 






495 


Lys 


Ser 


Cys 


Leu 


Asn 


Arg 


His Arg 


Arg Thr His 


Thr Gly 


Glu 


Arcr 










500 






505 






510 


riO 


Tyr Gly 




Ser 


Asp 


Cys Gly Lys Ala Phe Ser His 


Leu 


Ser 










c: i c 
— > j — > 






520 






525 


Cys 


Leu 


Val 


Tyr 


n _L o 


Lys 


Gly Met 


Leu His Ala 


Arg Glu 




Cys 










J Ju 






535 






540 


Val Gly Ser Val Lys Leu Glu Asn Pro Cys Ser Glu Ser His Ser
545 550 555

Leu Ser His Thr Arg Asp Leu Ile Gln Asp Lys Asp Ser Val Asn
560 565 570


rie u 


Val 


Thr 


Leu. 


Gin 


Met 


Pro Ser 


Val Ala Ala 


Gin Thr 


Ser 


Leu 










575 






580 






585 


Thr Asn Ser Ala Phe Gln Ala Glu Ser Lys Val Ala Ile Val Ser
590 595 600

Gln Pro Val Ala Arg Ser Ser Val Ser Ala Asp Ser Arg Ile Cys
605 610 615

Thr Glu





















<210> 7 
<211> 249 
<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7490652CD1 



<400> 7 


























Met Ala Val Gly Lys Asn Lys His Leu Met Lys Gly Gly Lys Lys
1 5 10 15

Gly Ala Glu Asn Arg Val Val Asp Pro Phe Ser Lys Lys Asp Trp
20 25 30

Cys Asp Val Lys Ala Leu Ala Met Phe Asn Ile Arg Asn Ile Gly
35 40 45

Glu Thr Leu Val Thr Arg Thr Arg Gly Thr Lys Ile Ala Ser Asp
50 55 60

Ser Leu Lys Arg Arg Val Phe Glu Val Ser Leu Ala Asp Leu Gln
65 70 75

Asn Asp Glu Val Ala Phe Arg Lys Phe Lys Leu Ile Ala Glu Asp
80 85 90

Val Gln Lys Lys Thr Asn Phe Gln Gly Met Asp Leu Pro Asp Glu
95 100 105

Met Cys Ser Val Val Lys Lys Trp Gln Thr Met Ile Glu Pro His
110 115 120

Ile Asp Val Lys Thr Thr Asp Gly Tyr Leu Phe His Leu Leu Cys
125 130 135

Asp Phe Thr Lys Lys His Asn Leu Ile Gln Lys Ala Ser Tyr Ala
140 145 150

Gln His Gln Gln Val Cys Glu Ile Gln Lys Lys Met Met Glu Ile
155 160 165

Met Thr Lys Gly Ala Asn Asp Leu Lys Glu Val Val Asn Lys Leu
170 175 180


lie 


■p-»— o nsT ~\7 ot" 


Thr 


Glv 


Lys 


Glu 


Lys 


Leu 


Cvs 


Leu 


Ser 


lie Tyr 






185 










190 








195 


Leu Leu His Asp Val Phe Val Arg Lys Val Lys Met Leu Lys Met
200 205 210

Pro Lys Phe Asp Leu Gly Lys Phe Met Gly Asn Cys Ser Gly Lys
215 220 225

Ala Thr Gly Asp Glu Thr Gly Ala Lys Val Glu Leu Ala Asp Gly
230 235 240

Tyr Glu Ala Leu Val Gln Glu Ser Val
245



<210> 8 

<211> 384 

<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7489744CD1 



<400> 8 



Met 


Glu 


Val 


lie 


TT_ T 

Val 


Glu 


Asn 


Leu 


His 


Leu 


Pro 


Thr 


Ser 


p-rn Tip 


1 








r- 

b 










10 










Pro Pro Val Ala Gly Ala Glu Ser Gly Pro Gln Arg Ala Leu Ser
20 25 30

Ser Pro Thr Ala Ala Ala Gly Leu Val Thr Ile Thr Pro Arg Glu
35 40 45

Glu Pro Gln Leu Pro Gln Pro Ala Pro Val Thr Ile Thr Ala Thr
50 55 60

Met Ser Ser Glu Ala Glu Thr Gln Gln Pro Pro Ala Ala Pro Pro
65 70 75

Ala Ala Pro Ala Leu Ser Ala Ala Asp Thr Lys Pro Gly Thr Thr
80 85 90

Gly Ser Gly Ala Gly Ser Gly Gly Pro Gly Gly Leu Thr Ser Ala
95 100 105

Ala Pro Ala Gly Gly Asp Lys Lys Val Ile Ala Thr Lys Val Leu
110 115 120

Gly Thr Val Lys Trp Phe Asn Val Arg Asn Gly Tyr Gly Phe Ile
125 130 135

Asn Arg Asn Asp Thr Lys Glu Asp Val Phe Val His Gln Thr Ala
140 145 150

Ile Lys Lys Asn Asn Pro Arg Lys Tyr Leu Arg Ser Val Gly Asp
155 160 165

Gly Glu Thr Val Glu Phe Asp Val Val Glu Gly Glu Lys Gly Ala
170 175 180

Glu Ala Ala Asn Val Thr Gly Pro Gly Gly Val Pro Val Gln Gly
185 190 195

Ser Lys Tyr Ala Ala Asp Arg Asn His Tyr Arg Arg Tyr Pro Arg
200 205 210

Arg Arg Gly Pro Pro Arg Asn Tyr Gln Gln Asn Tyr Gln Asn Ser
215 220 225

Glu Ser Gly Glu Lys Asn Glu Gly Ser Glu Ser Ala Pro Glu Gly
230 235 240

Gln Ala Gln Gln Arg Arg Pro Tyr Arg Arg Arg Arg Phe Pro Pro
245 250 255
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r P\nr 




Mpf 

nt- 1— 


Axg 


Arg pro 


Tyr Gly Arg Arg Pro Gin 


Tyr Ser Asn 










^ O \J 


265 


270 


XT-L KJ 


X. <-/ 




Gin 


Glv Glu 


Val Met Glu Gly Ala Asp 


Asn Gin Gly 










275 


280 


285 


Ala 


Gly 


Glu. 


Gin 


Gly Arg 


Pro Val Arg Gin Asn Met 


Tyr Arg Gly 










290 


295 


300 


oryx 


Arg 


Pro 


Arg 


Phe Arg 


Arg Gly Pro Pro Arg Gln


Arg Gln Pro










305 


310 


315 


Arg 




Asp 




A9il V31U 


Glu Asp Lys Glu Asn Gln Gly Asp Glu










320


325 


330 


TTrr 

1 XXX 




Gly 


Gln


Gln Pro


Pro Gln Arg Arg Tyr Arg


Arg Asn Phe 










"3 *5 


340 


345 


Asn 


Tyr 


Arg 


Arg 


Arg Arg 


Pro Glu Asn Pro Lys Pro 


Gln Asp Gly










350 


355 


360 


Gln


Glu 


Thr 


Lys 


Ala Ala 


Asp Pro Pro Ala Glu Asn 


Ser Ser Ala 










365 


370 


375 


Pro 


Glu 


Ala 


Glu 


Gln Gly


Gly Ala Glu 












380 







<210> 9 
<211> 312 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 3363382CD1 

<400> 9 



Met 


Ala 


Asp 


Gly Asp Ser 


Gly 


Ser Glu Arg Gly Gly Gly Gly Gly 


1 








5 






10 








15 


Pro 


Cys 


Gly 


Phe 


Gln Pro


Ala 


Ser Arg 


Gly 


Gly 


Gly 


Glu 


Gln Glu










20 






25 








30 


Thr 


Gln


Glu 


Leu 


Ala Ser 


Lys 


Arg Leu 


Asp 


Ile


Gln


Asn 


Lys Arg 










35 






40 








45 


Phe 


Tyr 


Leu 


Asp 


Val Lys 


Gln


Asn Ala 


Lys 


Gly 


Arg 


Phe 


Leu Lys 










50 






55 








60 


Ile


Ala 


Glu 


Val 


Gly Ala 


Gly 


Gly Ser Lys 


Ser 


Arg 


Leu 


Thr Leu 










65 






70 








75 


Ser 


Met 


Ala 


Val 


Ala Ala 


Glu 


Phe Arg 


Asp 


Ser 


Leu 


Gly 


Asp Phe 










80 






85 








90 


Ile


Glu 


His 


Tyr 


Ala Gln


Leu 


Gly Pro 


Ser 


Ser 


Pro 


Glu 


Gln Leu










95 






100 








105 


Ala 


Ala 


Gly Ala 


Glu Glu 


Gly 


Gly Gly Pro 


Arg 


Arg 


Ala 


Leu Lys 










110 






115 








120 


Ser 


Glu 


Phe 


Leu 


Val Arg 


Glu 


Asn Arg 


Lys 


Tyr 


Tyr 


Leu 


Asp Leu 










125 






130 








135 


Lys 


Glu 


Asn 


Gln


Arg Gly 


Arg 


Phe Leu 


Arg 


Ile


Arg 


Gln


Thr Val 










140 






145 








150 


Asn 


Arg 


Gly Gly Gly Gly 


Phe 


Gly Ala 


Gly 


Pro 


Gly 


Pro 


Gly Gly 










155 






160 








165 


Leu 


Gln


Ser Gly Gln Thr


Ile


Ala Leu 


Pro 


Ala 


Gln


Gly 


Leu Ile










170 






175 








180 


Glu 


Phe 


Arg 


Asp 


Ala Leu 


Ala 


Lys Leu 


Ile


Asp 


Asp 


Tyr Gly Gly 










185 






190 








195 


Glu 


Asp 


Asp 


Glu 


Leu Ala 


Gly 


Gly Pro 


Gly 


Gly 


Gly 


Ala 


Gly Gly 
200








205 










210 


Pro 


Gly 


Gly 


Gly 


Leu 


Tyr Gly 


Glu 


Leu 


Pro 


Glu 


Gly Thr 


Ser 


Ile










215 








220 










225 


Thr 


Val 


Asp 


Ser 


Lys 
230 


Arg Phe 


Phe 


Phe 


Asp 
235 


Val 


Gly 


Cys 


Asn 


Lys 
240 


Tyr Gly 


Val 


Phe 


Leu 


Arg Val 


Ser 


Glu 


Val 


Lys 


Pro 


Ser 


Tyr 


Arg 










245 








250 










255 


Asn 


Ala 


Ile


Thr 


Val 


Pro Phe 


Lys 


Ala 


Trp 


Gly 


Lys 


Phe 


Gly Gly 










260 








265 










270 


Ala 


Phe 


Cys 


Arg 


Tyr 
275 


Ala Asp 


Glu 


Met 


Lys 
280 


Glu 


Ile


Gln


Glu 


Arg 
285 


Gln


Arg 


Asp 


Lys 


Leu 


Tyr Glu 


Arg 


Arg 


Gly Gly Gly 


Ser 


Gly 


Gly 










290 








295 










300 


Gly 


Glu 


Glu 


Ser 


Glu 
305 


Gly Glu 


Glu 


Val 


Asp 
310 


Glu 


Asp 









<210> 10 
<211> 441 
<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7491148CD1 



<400> 10 



Met Lys 


Asp 


His 


Asp 


Ala 


Ile


Lys 


Leu 


Phe 


Val 


Gly Gln


Ile


Pro 


1 






5 










10 










15 


Arg Gly 


Leu 


Asp 


Glu 
20 


Gln


Asp 


Leu 


Lys 


Pro 

25 


Leu 


Phe 


Glu 


Glu 


Phe 
30 


Gly Arg 


Ile


Tyr 


Glu 

35 


Leu 


Thr 


Val 


Leu 


Lys 
40 


Asp 


Arg 


Leu 


Thr


Gly 
45 


Leu His 


Lys 


Gly 


Cys 


Ala 


Phe Leu 


Thr 


Tyr 


Cys 


Ala 


Arg 


Asp 


Ser 








50 










55 










60 


Ala Leu 


Lys 


Ala 


Gln


Ser 


Ala 


Leu 


His 


Glu 


Gln


Lys 


Thr 


Leu 


Pro 






65 










70 










75 


Gly Phe 


His 


Ile


Leu 
80 


Asn 


Asn 


Asn 


Asn 


Asn 
85 


Asn 


Lys 


Asn 


Arg 


Pro 
90 


Glu Asp 


Arg 


Lys 


Leu 


Phe 


Val 


Gly Met 


Leu 


Gly 


Lys 


Gln


Gln


Gly 








95 










100 










105 


Glu Glu 


Asp 


Val 


Arg 


Arg 


Leu 


Phe 


Gln


Pro 


Phe Gly His 


Ile


Glu 








110 










115 










120 


Glu Cys 


Thr 


Val 


Leu 


Arg 


Ser 


Pro 


Asp 


Gly Thr 


Ser 


Lys 


Gly 


Cys 








125 










130 










135 


Ala Phe 


Val 


Lys 


Phe 
140 


Gly 


Ser 


Gln


Gly 


Glu 
145 


Ala 


Gln


Ala 


Ala 


Ile
150 


Arg Gly Leu 


His 


Gly 


Ser 


Arg 


Thr 


Met 


Ala 


Gly Ala 


Ser 


Ser 


Ser 








155 










160 










165 


Leu Val 


Val 


Lys 


Leu 
170 


Ala 


Asp 


Thr 


Asp 


Arg 
175 


Glu 


Arg 


Ala 


Leu 


Arg 
180 


Arg Met 


Gln


Gln


Met 


Ala Gly His 


Leu 


Gly 


Ala 


Phe 


His 


Pro 


Ala 








185 










190 










195 


Pro Leu 


Pro 


Leu 


Gly Ala 


Cys 


Gly Ala 


Tyr 


Thr 


Thr 


Ala 


Ile


Leu 








200 










205 










210 


Gln His


Gln


Ala 


Ala 
215 


Leu 


Leu 


Ala 


Ala 


Ala 
220 


Gln


Gly 


Pro 


Gly 


Leu 
225 
Gly


Pro 


Val 


Ala 


Ala 
230 


Val Ala 


Ala 


Gln


Met Gln
235 


His 


Val 


Ala 


Ala 
240 


Phe 


Ser 


Leu 


Val 


Ala 
245 


Ala Pro 


Leu 


Leu 


Pro Ala 
250 


Ala 


Ala 


Ala 


Asn 
255 


Ser 


Pro 


Pro 


Gly 


Ser 


Gly Pro 


Gly 


Thr 


Leu Pro Gly Leu 


Pro 


Ala 










260 








265 








270 


Pro 


Ile


Gly 


Val 


Asn 
275 


Gly Val 


Arg 


Pro 


Ser Asp 
280 


Thr 


Pro 


Arg 


Ser 
285 


Asn 


Gly Gln Pro Gly Ser Asp Thr Leu Tyr Asn Asn Gly Leu


Ser 










290 








295 








300 


Pro 


Tyr 


Pro 


Ala 


Gln


Ser Pro 


Gly Val 




Pro 


Leu 


Gln


Gln










305 








310 








315 


Ala 


Tyr


Ala 


Gly Met 


His His 


Tyr 


Ala 


Ala Ala 


Tyr 


Pro 


Ser 


Ala 










320 








325 








330 


Tyr 


Ala 


Pro 


Val 


Ser 


Thr Ala


Phe 


Pro 


w JLll VJX11 


Pro 


Ser 


Ala 


Leu 








335 








340 








345 


Pro 


Gln


Gln


Gln


Arg 
350 


Glu Gly 


Pro 


Glu 


Gly Cys 
355 


Asn 


Leu 


Phe 


Ile
360 


Tyr 


His 


Leu 


Pro 


Gln
365 


Glu Phe 


Gly 


Asp 


Ala Glu 
370 


Leu 


Ile


Gln


Thr 
375 


Phe 


Leu 


Pro 


Phe 


Gly 
380 


Ala Val 


Val 


Ser 


Ala Lys 
385 


Val 


Phe 


Val 


Asp 
390 


Arg 


Ala 


Thr 


Asn 


Gln
395 


Ser Lys 


Cys 


Phe 


Gly Phe 
400 


Val 


Ser 


Phe 


Asp 
405 


Asn 


Pro 


Thr 


Ser 


Ala 
410 


Gln Thr


Ala 


Ile


Gln Ala
415 


Met 


Asn 


Gly 


Phe 
420 


Gln


Ile


Gly Met 


Lys 


Arg Leu 


Lys 


Val 


Gln Leu


Lys 


Arg 


Pro 


Lys 










425 








430 








435 


Asp 


Ala 


Asn 


Arg 


Pro 
440 


Tyr 
<223> Incyte ID No: 8126343CD1

<400> 11 



Met 


Ala 


Thr 


Asp 


Leu Pro 


Ile


Met 


Ala Arg Gly Pro Ala 


Arg 


Ser 


1 








5 






10 




15 


Ala 


Ala 


Pro 


Ala 


Gly Gly 


Ser 


Ser 


Ser Gly Cys Gly Ala 


Arg 


Gln










20 






25 




30 


Gly 


Arg 


Ala 


Gly 


Gly Gly 


Val 


Leu 


Ala Met Ala Gly Leu 


Ser 


Asp 










35 






40 




45 


Leu 


Glu 


Leu 


Arg 


Arg Glu 


Leu 


Gln


Ala Leu Gly Phe Gln


Pro 


Gly 










50 






55 




60 


Pro 


Ile


Thr 


Asp 


Thr Thr 


Arg 


Asp 


Val Tyr Arg Asn Lys 


Leu 


Arg 










65 






70 




75 


Arg 


Leu 


Arg 


Gly 


Glu Ala 


Arg 


Leu 


Arg Asp Glu Glu Arg 


Leu 


Arg 










80 






85 




90 


Glu 


Glu 


Ala 


Arg 


Pro Arg 


Gly 


Glu 


Glu Arg Leu Arg Glu 


Glu 


Ala 










95 






100 




105 


Arg 


Leu 


Arg 


Glu 


Asp Ala 


Pro 


Leu 


Arg Ala Arg Pro Ala 


Ala 


Ala 
110








115 










120 


Ser 


Pro 


Arg 


Ala 


Glu 


Pro 


Trp 


Leu Ser 


Gln


Pro 


Ala 


Ser 


Gly 


Ser 










125 








130 










135 


Ala Tyr Ala Thr Pro Gly Ala Tyr Gly Asp 


Ile


Arg 


Pro 


Ser 


Ala 










140 








145 










150 


7V 1 =» 
Hid 


OC=JL 


±ip 


Val


Gly 


Ser 


Arg 


Gly Leu


Ala 


Tyr 


Pro 


Ala 


Arg 


Pro 










155 








160 










165 


Ala 


pi t-i 


Leu 


Arg 


Arg 


Arg 


Ala 


Ser Val 


Arg 


Gly 


Ser 


Ser 


Glu 


Glu 










170








175 










180 


Asp 


pi ^ 


Asp 


Ala 


Arg 


Thr 


Pro 


ao^J Ax. y 


Ala 


Thr 


Gln


Gly 


Pro 


Gly 










185








190 










195 


T on 

Leu 


Ala 


Ala 


Arg 


Arg 


Trp Trp 


Ala Ala 


Ser 


Pro 


Ala 


Pro 


Ala 


Arg 


















205 










210 


Leu 


Pro


Ser 


Ser 


Leu 


T 

lieu 


pi , r 

ijiy 


Jl J_ \J nop 


Pro 


Arg 


Pro Gly Leu 


Arg 




























225 


Ala 


*"Pl-» -v- 


Arg 


Ala 


pi •* » 
biy 


Pro 


Til — 

Ala 


Pi v 7\1 =, 

JL_y riia 


Ala


Arg 


Ala 


aj_ y 


Pro 


Glu 










230








235 










240 


Val


pi T 


Arg 


Arg 


Leu 


Glu


Arg 


r P>"7-\ T i 


Ser 


Arg 


Leu 


Leu 


Leu 


Trp 


















£i J 










255 


Ala


Ser 


Leu 


Gly 


Leu 


Leu 


Leu 


vai riic 


Leu 


Gly 


Ile


Leu 


Trp 


Val 










260








265










270 






Gly Lys 


Pro 


Ser 


Ala 


Pro Gln


Glu 


Ala 


Glu 


Asp 


Asn 


Met 










275








280 










285 


Lys 


Leu 


Leu 


Pro 


Val 


Asp 


Cys 


vji u Ai y 


Lys 


Thr Asp 


Glu 


Phe 


Cys 










290 








295










300 




Til- 
Ala 


Lys 


Gln


Lys 


Ala 


Ala 


l_t ti Li JjCU 


Glu 


Leu 


Leu 


His 


Glu 


Leu 










305 








310










315 


Tyr


Asn 


Phe 


Leu 


Ala 


Ile


Gln


Ai n pi 

Ala . vsiy 


Asn 


Phe 


Glu 


Cys 


Gly


Asn 










320 








325










330 


Pro 


Glu


Asn 


Leu 


Lys 


Ser 


Lys 


Lys Ile


Pro 


Val 


Met 


Glu 


Ala 


Gln










335 


















345 


Glu 


Tyr 


Ile


Ala 


Asn 


Val


rp"U» -»— 

Thr


Ser Ser 


OCi 


Ser 


Ala 


Lys 


Phe 


Glu 










350 


















360 


Ala 


Ala


Leu 


Thr Trp 


Ile


Leu 


OCX O CI 


A oil 


Lys 


Asp 


Val 


Gly 


Ile










365 








370










375 


Trp 


Leu


Lys 


Gly 


Glu 


Asp 


Gln


o@i wiu 


Leu 


Val 


Thr 


Thr 


Val 


Asp 










380 








385










390 


Lys 


Val


Val 


Cys 


Leu 


Glu 


Ser 


Ala Ml C! 

Ala nio 


AT X V-> 


Arg Met Gly Val 


Gly 










395 








400 










405 


Cys 


Arg 


Leu 


Ser 


Arg 


Ala 


Leu 




Ala 


Val 


Thr 


Asn 


Val 


Leu 










410 








415










420 


Ile


Phe 


Phe 


Trp 


Cys 


Leu 


Ala 


Phe Leu 


Trp 


Gly Leu 


Leu 


Ile


Leu 










425 








430 










435 


Leu 


Lys 


Tyr 


Arg 


Trp 


Arg 


Lys 


Leu Glu 


Glu 


Glu 


Glu 


Gln


Ala 


Met 










440 








445 










450 


Tyr 


Glu 


Met 


Val 


Lys 


Lys 


Ile


Ile Asp


Val 


Val 


Gln


Asp 


His 


Tyr 










455 








460 










465 


Val 


Asp 


Trp 


Glu 


Gln


Asp 


Met 


Glu Arg 


Tyr 


Pro 


Tyr 


Val 


Gly 


Ile










470 








475 










480 


Leu 


His 


Val 


Arg 


Asp 


Ser 


Leu 


Ile Pro


Pro 


Gln


Ser 


Arg 














485 








490 
<220>

<221> misc_feature 



<223> Incyte ID No: 7044055CD1










<400> 12 














Met 


Ala Ala Val 


Ser 


Leu Arg Leu Gly Asp Leu Val 


Trp 


Gly


Lys


1 




5 


10








15


Leu 


Gly Arg Tyr 


Pro 


Pro Trp Pro Gly Lys Ile


Val 


Asn 


Pro 


Pro 






20 


25








30 


Lys 


Asp Leu Lys 


Lys 


Pro Arg Gly Lys Lys Cys 


Phe


Phe


Val


Lys 






35 


40








45 


Phe 


Phe Gly Thr 


Glu 


Asp His Ala Trp Ile Lys


Val


Glu 


Gln


Leu 






50 


55








60 


Lys 


Pro Tyr His 


Ala 


His Lys Glu Glu Met Ile


Lys 


Ile


Asn 


T - -mm 

Lys 






65 


70








75 


Gly 


Lys Arg Phe 


Gln


Gln Ala Val Asp Ala Val


Glu 


Glu 


Phe 


Leu 






80 










90 


Arg 


Arg Ala Lys 


Gly 


Lys Asp Gln Thr Ser Ser


His 


Asn 


Ser 


Ser 






95 


100 








105 


Asp 


Asp Lys Asn 


Arg 


Arg Asn Ser Ser Glu Glu 


Arg 


Ser 


Arg 


Pro 






110 


115 








120 


Asn 


Ser Gly Asp 


Glu 


Lys Arg Lys Leu Ser Leu 


Ser 


Glu 


Gly 


Lys 






125 


130 








135 


Val 


Lys Lys Asn 


Met 


Gly Glu Gly Lys Lys Arg 


Val 


Ser 


Ser 


Gly 






140 


145 








150 


Ser 


Ser Glu Arg 


Gly 


Ser Lys Ser Pro Leu Lys 


Arg 


Ala 


Gln


Glu 






155 


160 








165 


Gln


Ser Pro Arg 


Lys 


Arg Gly Arg Pro Pro Lys 


Asp 


Glu 


Lys 


Asp 






170 


175 








180 


Leu 


Thr Ile Pro


Glu 


Ser Ser Thr Val Lys Gly Met Met 


Ala 


Gly 






185 


190 








195 


Pro 


Met Ala Ala 


Phe 


Lys Trp Gln Pro Thr Ala


Ser 


Glu 


Pro 


Val 






200 


205 








210 


Lys 


Asp Ala Asp 


Pro 


His Phe His His Phe Leu 


Leu 


Ser 


Gln


Thr 






215 


220 








225 


Glu 


Lys Pro Ala 


Val 


Cys Tyr Gln Ala Ile Thr


Lys 


Lys 


Leu 


Lys 






230 


235 








240 


Ile


Cys Glu Glu Glu 


Thr Gly Ser Thr Ser Ile


Gln


Ala 


Ala 


Asp 






245 


250 








255 


Ser 


Thr Ala Val 


Asn 


Gly Ser Ile Thr Pro Thr


Asp 


Lys 


Lys 


Ile






260 


265 








270 


Gly Phe Leu Gly Leu Gly Leu Met Gly Ser Gly Ile


Val 


Ser 


Asn 






275 


280 








285 


Leu 


Leu Lys Met 


Gly His Thr Val Thr Val Trp 


Asn 


Arg 


Thr 


Ala 






290 


295 








300 


Glu 


Lys Cys Asp 


Leu 


Phe Ile Gln Glu Gly Ala


Arg 


Leu 


Gly Arg 






305 


310 








315 


Thr 


Pro Ala Glu 


Val 


Val Ser Thr Cys Asp Ile


Thr 


Phe 


Ala 


Cys 






320 


325 








330 


Val 


Ser Asp Pro 


Lys 


Ala Ala Lys Asp Leu Val 


Leu Gly 


Pro 


Ser 






335 


340 








345 


Gly Val Leu Gln


Gly 


Ile Arg Pro Gly Lys Cys


Tyr 


Val 


Asp 


Met 






350 


355 








360 


Ser 


Thr Val Asp 


Ala 


Asp Thr Val Thr Glu Leu 


Ala 


Gln


Val 


Ile






365 


370 








375 


Val 


Ser Arg Gly 


Gly 


Arg Phe Leu Glu Ala Pro 


Val 


Ser 


Gly 


Asn 
380








385 










390 


Gln


Gln


Leu 


Ser 


Asn 


Asp 


Gly 


Met Leu 


Val 


Ile


Leu 


Ala 


Ala Gly 










395 








400 










405 


Asp 




Gly Leu 


Tvnr 


Glu 


Asp 


Cys Ser 


Ser 


Cys 


Phe 


Gln


Ala 


Met 










410 








415 










420 


Gly 




1 IIX 




Phe 


Phe 


Leu Gly Glu Val 


Gly Asn 


Ala 


Ala 


Lys 










425 








430 










435 


Met 


Met 


Leu 


Ile


Val 


Asn 


W 0 f- 
171 L- 


Val Gln Gly


Ser 


Phe 


Met 


Ala 


Thr 










440 








445










450 


Ile


Ala 


Glu 


Gly 


Leu 


Thr


Leu 


Ala Gln


Val 


Thr 


Gly 


Gln


Ser 


Gln










455 








460 










465 


VJ7XX1 




Leu 


Leu 


Asp 


Ile


Leu 


Asn Gln


Gly 


Gln


Leu 


Ala 


Ser 


Ile










470 








475 










480 




J_i 6 Ll 


Asp 


Gln


i_J_Y 


Cys 


Gln


Asn Ile


Leu 


Gln


Gly Asn 


Phe 


Lys 










485








490 










495 


Pro 


Asp 


Phe 


Tyr 


Leu 


Lys 


Tyr 


Ile Gln


Lys 


Asp 


Leu 


Arg 


Leu 


Ala 










500








505 










510 


Ile


Ala 


Leu Gly 


Asp 


Ala 


Val 


Asn His 


Pro 


Thr 


Pro 


Met 


Ala 


Ala 










515 








520 










525 


Ala 


Ala 


Asn 


Glu 


Val 


Tyr 


Lys 


Arg Ala 


Lys 


Ala 


Leu 


Asp 


Gln


Ser 










530 








535 










540 


Asp 


Asn 


Asp Met 


Ser 


Ala 


Val 


Tyr Arg 


Ala 


Tyr 


Ile


His 














545 








550 













<210> 13 

<211> 1726 

<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7493424CD1 



<400> 13 

Met Lys Ala Gln Lys Ser Gly Lys Glu Gln Gln Leu Asp Ile Met
1               5                   10                  15

Asn Lys Gln Tyr Gln Gln Leu Glu Ser Arg Leu Asp Glu Ile Leu
                20                  25                  30

Ser Arg Ile Ala Lys Glu Thr Glu Glu Ile Lys Asp Leu Glu Glu
                35                  40                  45

Gln Leu Thr Glu Gly Gln Ile Ala Ala Asn Glu Ala Leu Lys Lys
                50                  55                  60

Asp Leu Glu Gly Val Ile Ser Gly Leu Gln Glu Tyr Leu Gly Thr
                65                  70                  75

Ile Lys Gly Gln Ala Thr Gln Ala Gln Asn Glu Cys Arg Lys Leu
                80                  85                  90

Arg Asp Glu Lys Glu Thr Leu Leu Gln Arg Leu Thr Glu Val Glu
                95                  100                 105

Gln Glu Arg Asp Gln Leu Glu Ile Val Ala Met Asp Ala Glu Asn
                110                 115                 120

Met Arg Lys Glu Leu Ala Glu Leu Glu Ser Ala Leu Gln Glu Gln
                125                 130                 135

His Glu Val Asn Ala Ser Leu Gln Gln Thr Gln Gly Asp Leu Ser
                140                 145                 150

Ala Tyr Glu Ala Glu Leu Glu Ala Arg Leu Asn Leu Arg Asp Ala
                155                 160                 165
Glu


Ala 


Asn 


Gln


Leu 


Lys 


Glu 


Glu 


Leu 


Glu 


Lys 


Val 


Thr 


Arg 


Leu 










170 










175 










180 


Thr 


Gln


Leu 


Glu 


Gln


Ser 


Ala 


Leu 


Gln


Ala 


Glu 


Leu 


Glu 


Lys 


Glu 










185 










190 










195 


Arg 


Gln


Ala 


Leu 


Lys 


Asn 


Ala 


Leu 


Gly Lys 


Ala 


Gln


Phe 


Ser 


Glu 










200 










205 










210 


Glu 


Lys 


Glu 


Gln


Glu 


Asn 


Ser 


Glu 


Leu 


His 


Ala 


Lys 


Leu 


Lys 


His 










215 










220 










225 


Leu 


Gln


Asp 


Asp 


Asn 


Asn 


Leu 


Leu 


Lys 


Gln


Gln


Leu 


Lys Asp 


Phe 










230 










235 










240 


Gln


Asn 


His 


Leu 


Asn 


His 


Val 


Val 


Asp 


Gly 


Leu 


Val 


Arg 


Pro 


Glu 










245 










250 










255 


Glu 


Val 


Ala 


Ala 


Arg 


Val 


Asp 


Glu 


Leu 


Arg 


Arg 


Lys 


Leu 


Lys 


Leu 










260 










265 










270 


Gly Thr Gly Glu Met 


Asn 


Ile


His 


Ser 


Pro 


Ser 


Asp 


Val 


Leu 


Gly 










275 










280 










285 


Lys


Ser 


Leu 


Ala 


Asp 


Leu 


Gln


Lys 


Gln


Phe 


Ser 


Glu 


Ile


Leu 


Ala 










290 










295 










300 


Arg


Ser Lys 


Trp 


Glu 


Arg 


Asp 


Glu 


Ala 


Gln


Val 


Arg 


Glu 


Arg 


Lys 










305 










310 










315 


Leu 


Gln


Glu 


Glu 


Met 


Ala 


Leu 


Gln


Gln


Glu 


Lys 


Leu 


Ala Thr Gly 










320 










325 










330 


Gln


Glu 


Glu 


Phe 


Arg 


Gln


Ala 


Cys 


Glu 


Arg 


Ala 


Leu 


Glu 


Ala 


Arg 










335 










340 










345 


Met 


Asn 


Phe 


Asp 


Lys 


Arg 


Gln


His 


Glu 


Ala 


Arg 


Ile


Gln


Gln


Met 










350 










355 










360 


Glu 


Asn 


Glu 


Ile


His 


Tyr 


Leu 


Gln


Glu 


Asn 


Leu 


Lys 


Ser 


Met 


Glu 










365 










370 










375 


Glu 


Ile


Gln


Gly 


Leu 


Thr 


Asp 


Leu 


Gln


Leu 


Gln


Glu 


Ala 


Asp 


Glu 










380 










385 










390 


Glu 


Lys 


Glu 


Arg 


Ile


Leu 


Ala 


Gln


Leu 


Arg 


Glu 


Leu 


Glu 


Lys 


Lys 










395 










400 










405 


Lys 


Lys 


Leu 


Glu 


Asp 


Ala 


Lys 


Ser 


Gln


Glu 


Gln


Val 


Phe Gly 


Leu 










410 










415 










420 


Asp 


Lys 


Glu 


Leu 


Lys 


Lys 


Leu 


Lys 


Lys 


Ala 


Val 


Ala 


Thr 


Ser 


Asp 










425 










430 










435 


Lys 


Leu 


Ala 


Thr 


Ala 


Glu 


Leu 


Thr 


Ile


Ala 


Lys 


Asp 


Gln


Leu 


Lys 










440 










445 










450 


Ser 


Leu 


His 


Gly Thr 


Val 


Met 


Lys 


Ile


Asn 


Gln


Glu 


Arg 


Ala 


Glu 










455 










460 










465 


Glu 


Leu 


Gln


Glu 


Ala 


Glu 


Arg 


Phe 


Ser 


Arg 


Lys 


Ala 


Ala 


Gln


Ala 










470 










475 










480 


Ala 


Arg 


Asp 


Leu 


Thr 


Arg 


Ala 


Glu 


Ala 


Glu 


Ile


Glu 


Leu 


Leu 


Gln










485 










490 










495 


Asn 


Leu 


Leu 


Arg 


Gln


Lys 


Gly Glu 


Gln


Phe 


Arg 


Leu 


Glu 


Met 


Glu 










500 










505 










510 


Lys Thr Gly Val 


Gly 


Thr 


Gly Ala 


Asn 


Ser 


Gln


Val 


Leu 


Glu 


Ile










515 










520 










525 


Glu 


Lys 


Leu 


Asn 


Glu 


Thr Met 


Glu 


Arg 


Gln


Arg 


Thr 


Glu 


Ile


Ala 










530 










535 










540 


Arg 


Leu 


Gln


Asn 


Val 


Leu 


Asp 


Leu 


Thr 


Gly 


Ser 


Asp 


Asn 


Lys 


Gly 










545 










550 










555 


Gly 


Phe 


Glu 


Asn 


Val 


Leu 


Glu 


Glu 


Ile


Ala 


Glu 


Leu 


Arg 


Arg 


Glu 










560 










565 










570 


Val 


Ser 


Tyr 


Gln


Asn 


Asp 


Tyr 


Ile


Ser 


Ser 


Met 
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pi 

Vj J.U 
Ile


Met 










95 








100
Asn


Val
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■J\ "1 

Ala 


Ile Tyr Glu Val


Leu


Arg 


Asn 


Phe


Gly Thr Val 










110 








115
120


Leu 


Arg 


Leu 


Ser 


Pro 


Phe Arg 


Phe 


Glu


Asp 


Phe


Cys 


Ala 


Ala 


Leu 
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liU 










135 


Val 


Ser 


Gln


Glu 


Gln


Cys Thr 


Leu 


Met


Ala 


Glu




His 


Val 


Val 










140 








145
Leu


Leu 
Ala


Val 


Leu Arg 


Glu 


Glu 


Asp 


Thr


Ser


Asn 


Thr 


Thr 
Asp
Thr


Leu Tyr 
Cys
Lys


Glu 
Tyr


Gln


Glu 


Ala 
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o i n 
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Pro 


Tyr Gly Pro Val 
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Lys Val 


Leu 










215 
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Leu 


Val 
Thr
Glu
Glu
Met


Ser 


Glu 
Ile
Tyr
Asp
Arg
Lys
Thr


Cys 
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ZOO 
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Leu Glu Cys Val 


Lys 


Pro 


Pro 
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Glu 


Glu 


Val 
Trp
Val
Ile
Arg
320
330


Tyr 


Trp 


Phe


Leu 


Asn 
Ile


Ile
Thr
Glu
Trp
Thr


Lys 
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410 








415 










420 


Ile


Arg 


Ala 


Lys 
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Glu Glu 
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Val 
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Ser Asp 
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Met 
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J D 








940 
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X XI XT 


V3XU 
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His Val 
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Ser 


Gln Val
Asp


Val


Val


Asn 


Val 


Ser Glu Gly 


Phe 
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Leu 


Arg 


Thr 


Ser Tyr 
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970 








975 


Lys 


Lys 


Lys 


Thr 


Lys 


Ser Ser 


Lys 


Leu 


Asp Gly Leu 


Leu


Glu Arg 










980 








985 








990 


Arg 




Lys 


Gln


Phe 


Thr Leu 


Glu 


Glu 


Lys 


Gln


Arg 


Leu 


Glu Lys
Leu


Glu 


Gly Gly Ile


Lys 


Gly 


Ile Gly Lys


Thr 


Ser Thr 
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Asn 


Ser 


Ser 


Lys Asn 


Leu Ser 


Glu 


Ser 


Pro 


Val 


Ile


Thr 


Lys Ala 
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1030 








1035 


Lys 
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Cys 


Gln


Ser Asp 


Ser 


Met 
Gln


Glu 
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Ser Pro 








1040 






1045 








1050 


Asn 


Ala 
Asn
Gln Pro


Glu Asp Leu 


Ile


Gln


Gly Cys Ser 








1055 






1060 
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Ser 


Asp 


Ser 


Ser 


Val Leu 
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Met 


Ser 


Asp 


Pro 


Ser 


His Thr 
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1075 
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i nir 
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Leu 


Tyr 
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Asp 


Arg 


Val 


Leu 


Asp 


Asp 


Val Ser 
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1090 








1095 


Ile
Ser


Pro 


Glu 


Thr Lys 


Cys 


Pro 


Lys 


Gln


Asn 


Ser 


Ile Glu










1100 
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x±e 


Glu 


Glu Lys Val 


Ser 


Asp 
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Ala 


Ser 
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Ser 


Lys 
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Gly Asn 


Asp 
Ile Asp
1145








1150 
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Lys 
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Lys 


Lys 
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Ser 
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Thr 


Ile Val
1170


Ser 


Ser 


Ser 
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Ser 
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Ser 


Ser 


Val 


Pro 


Lys 


Ser Thr 
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Asn 


Asp 


Arg 


Asp 


Ala 
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Leu 


Ser 


Arg 


Ala 


Met 


Asp 


Phe Glu 
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Gly Lys 


Leu Gly 


Cys 


Asp Ser 


Glu 


Ser 


Asn 


Ser 


Thr 


Leu 


Glu Asn 
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1210 
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Ser 


Ser 


Asp 


Thr 


Val 


Ser Ile


Gln


Asp 


Ser


Ser 


Glu 


Glu 


Asp Met 
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1230 


Ile


Val 


Gln


Asn 


Ser


Asn Glu 


Ser 


Ile


Ser


Glu 


Gln


Phe 


Arg Thr 










1235 








1240 








1245 


Arg 


Glu 


Gln


Asp Val 


Glu Val 


Leu 


Glu 


Pro 


Leu 


Lys 


Cys 


Glu Leu 










1250 








1255 








1260 


Val 


Ser 


Gly Glu Ser 


Thr Gly Asn 


Cys 


Glu


Asp 


Arg 


Leu 


Pro Val 
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Lys Gly Thr Glu Ala 
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Gin 


Lys Lys 
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Glu 


Glu 
Lys
Lys Ile
Lys Gly Glu
Lys


Glu Ile
Ser


Glu 


Ser 


Arg Val 


Val 


Ser Gly 
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Glu 
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Lys 


Val Asn 








1355 








1360 
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Asn 


Ile


Asn 


Lys Ile


Ile


Pro 


Glu 


Asn Asp 


Ile
Ser


Leu Thr 
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1375 
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Val 


Lys 


Glu 


Ser Ala 


Ile


Arg 
Phe Ile Asn Gly Asp Val Ile
1390
Met


Glu 


Asp 


Phe Asn 


Glu 


Arg 
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Ser Ser 


Glu 


Thr 


Lys 


Ser His 








1400 
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Leu 


Leu 


Ser 


Ser Ser 


Asp 


Ala 


Glu 


Gly Asn 


Tyr 


Arg 


Asp 


Ser Leu 








1415 








1420 








1425 


Glu 


Thr 


Leu 


Pro Ser 


Thr 


Lys 


Glu 


Ser Asp 


Ser 


Thr 


Gln


Thr Thr 








1430 








1435 








1440 


Thr 


Pro 


Ser 


Ala Ser 
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Glu 


Ser Asn 


Ser 


Val 


Asn 


Gln Val








1445 








1450 
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Glu 


Asp 


Met 


Glu Ile


Glu 


Thr 


Ser 


Glu Val 


Lys 


Lys 


Val 


Thr Ser 








1460 
Ser
Ile


Thr Ser 


Glu 


Glu 


Glu 


Ser Asn 


Leu 


Ser 


Asn 


Asp Phe 
Ile
Glu


Asn Gly 
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He 
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Asn 


Glu 


Asn 


Val Asn 
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Gly 


Glu 


Ser 


Lys Arg 


Lys 


Thr 


Val 


He Thr 


Glu 


Val 


Thr 


Thr Met 
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Thr 


Ser 


Thr 


Val Ala 


Thr 


Glu 


Ser 


Lys Thr 


Val 


He 


Lys 


Val Glu 








1520 








1525 








1530 


Lys 


Gly 
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Val 
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Ser Ser 


Thr 


Glu 
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Cys Ala 
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Thr 


Thr 


Thr 
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Val 
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will 


nl v Hi n 
<jiy w-Lii 








1910 








1915 








i Q?n 

IJ^U 


Sear 


Asn 


Ser 


Gly Val 


Val 


Gin 


Val 


Gin Gin 


Lys 


\7^1 
v al 


Leu. 


r>l v Tl ^ 
oiy lit- 








1925 








1930 








1 Q'JC 
X -7 -J ~J 


x±.e 


Pro 


Ser 


Ser Thr Gly 


Thr 


Ser 


Gin Gin 


Thr 




X 1X1 


Cor- "Prt«s» 








1940 








1945 








1 950 

13 Ju 


Gin 


Pro 


Arg 


Thr Ala 


Thr 


val 


Thr 


He Arg 


Pro 


Asn 


i nr 


oex uiy 








1955 








1960 








1 Qfi5 


Ser 


Gly 


Gly 


Thr Thr 


Ser 


Asn 


Ser 


Gin Val 


He 


jl nr 


uiy 


Pro Gin 








1970 








1975 








1 QRO 

1-7 O f 


He 


Arg 


Pro 


Gly Met 


Thr 


val 


He 


Arg Thr 


Pro 


Leu 


/*2l v-i 

win 


win oer 








1985 








1990 








1 QQC, 
X _7 -7 _J 


Thr 


Leu 


Gly 


Lys Ala 


He 


Tl 

lie 


Arg 


Thr Pro 


Val 




v dX 


will nu 








2000 








2005 








9010 


Gly Ala 


Pro 


Gin Gin 


Val 


rlSL 


Thr 


Gin He 


He 


Arg 


Gly 


Gin Pro 








2015 








2020 








2025 


Val Ser Thr Ala Val Ser Ala Pro Asn Thr Val Ser Ser Thr Pro
2030 2035 2040

Gly Gln Lys Ser Leu Thr Ser Ala Thr Ser Thr Ser Asn Ile Gln
2045 2050 2055

Ser Ser Ala Ser Gln Pro Pro Arg Pro Gln Gln Gly Gln Val Lys
2060 2065 2070

Leu Thr Met Ala Gln Leu Thr Gln Leu Thr Gln Gly His Gly Gly
2075 2080 2085

Asn Gln Gly Leu Thr Val Val Ile Gln Gly Gln Gly Gln Thr Thr
2090 2095 2100

Gly Gln Leu Gln Leu Ile Pro Gln Gly Val Thr Val Leu Pro Gly
2105 2110 2115

Pro Gly Gln Gln Leu Met Gln Ala Ala Met Pro Asn Gly Thr Val
2120 2125 2130

Gln Arg Phe Leu Phe Thr Pro Leu Ala Thr Thr Ala Thr Thr Ala
2135 2140 2145

Ser Thr Thr Thr Thr Thr Val Ser Thr Thr Ala Ala Gly Thr Gly
2150 2155 2160

Glu Gln Arg Gln Ser Lys Leu Ser Pro Gln Met Gln Val His Gln
2165 2170 2175

Asp Lys Thr Leu Pro Pro Ala Gln Ser Ser Ser Val Gly Pro Ala
2180 2185 2190

Glu Ala Gln Pro Gln Thr Ala Gln Pro Ser Ala Gln Pro Gln Pro
2195 2200 2205

Gln Thr Gln Pro Gln Ser Pro Ala Gln Pro Glu Val Gln Thr Gln
2210 2215 2220

Pro Glu Val Gln Thr Gln Thr Thr Val Ser Ser His Val Pro Ser
2225 2230 2235

Glu Ala Gln Pro Thr His Ala Gln Ser Ser Lys Pro Gln Val Ala
2240 2245 2250

Ala Gln Ser Gln Pro Gln Ser Asn Val Gln Gly Gln Ser Pro Val
2255 2260 2265

Arg Val Gln Ser Pro Ser Gln Thr Arg Ile Arg Pro Ser Thr Pro
2270 2275 2280

Ser Gln Leu Ser Pro Gly Gln Gln Ser Gln Val Gln Thr Thr Thr
2285 2290 2295

Ser Gln Pro Ile Pro Ile Gln Pro His Thr Ser Leu Gln Ile Pro
2300 2305 2310

Ser Gln Gly Gln Pro Gln Ser Gln Pro Gln Val Val Met Lys His
2315 2320 2325

Asn Ala Val Ile Glu His Leu Lys Gln Lys Lys Ser Met Thr Pro
2330 2335 2340

Ala Glu Arg Glu Glu Asn Gln Arg Met Ile Val Cys Asn Gln Val
2345 2350 2355

Met Lys Tyr Ile Leu Asp Lys Ile Asp Lys Glu Glu Lys Gln Ala
2360 2365 2370

Ala Lys Lys Arg Lys Arg Glu Glu Ser Val Glu Gln Lys Arg Ser
2375 2380 2385

Lys Gln Asn Ala Thr Lys Leu Ser Ala Leu Leu Phe Lys His Lys
2390 2395 2400

Glu Gln Leu Arg Ala Glu Ile Leu Lys Lys Arg Ala Leu Leu Asp
2405 2410 2415

Lys Asp Leu Gln Ile Glu Val Gln Glu Glu Leu Lys Arg Asp Leu
2420 2425 2430

Lys Ile Lys Lys Glu Lys Asp Leu Met Gln Leu Ala Gln Ala Thr
2435 2440 2445

Ala Val Ala Ala Pro Cys Pro Pro Val Thr Pro Ala Pro Pro Ala
2450 2455 2460

Pro Pro Ala Pro Pro Pro Ser Pro Pro Pro Pro Pro Ala Val Gln
2465 2470 2475

His Thr Gly Leu Leu Ser Thr Pro Thr Leu Pro Ala Ala Ser Gln
2480 2485 2490

Lys Arg Lys Arg Glu Glu Glu Lys Asp Ser Ser Ser Lys Ser Lys
2495 2500 2505

Lys Lys Lys Met Ile Ser Thr Thr Ser Lys Glu Thr Lys Lys Asp
2510 2515 2520

Thr Lys Leu Tyr Cys Ile Cys Lys Thr Pro Tyr Asp Glu Ser Lys
2525 2530 2535

Phe Tyr Ile Gly Cys Asp Leu Cys Thr Asn Trp Tyr His Gly Glu
2540 2545 2550

Cys Val Gly Ile Thr Glu Lys Glu Ala Lys Lys Met Asp Val Tyr
2555 2560 2565

Ile Cys Asn Asp Cys Lys Arg Ala Gln Glu Gly Ser Ser Glu Glu
2570 2575 2580

Leu Tyr Cys Ile Cys Arg Thr Pro Tyr Asp Glu Ser Gln Phe Tyr
2585 2590 2595

Ile Gly Cys Asp Arg Cys Gln Asn Trp Tyr His Gly Arg Cys Val
2600 2605 2610

Gly Ile Leu Gln Ser Glu Ala Glu Leu Ile Asp Glu Tyr Val Cys
2615 2620 2625

Pro Gln Cys Gln Ser Thr Glu Asp Ala Met Thr Val Leu Thr Pro
2630 2635 2640

Leu Thr Glu Lys Asp Tyr Glu Gly Leu Lys Arg Val Leu Arg Ser
2645 2650 2655

Leu Gln Ala His Lys Met Ala Trp Pro Phe Leu Glu Pro Val Asp
2660 2665 2670

Pro Asn Asp Ala Pro Asp Tyr Tyr Gly Val Ile Lys Glu Pro Met
2675 2680 2685

Asp Leu Ala Thr Met Glu Glu Arg Val Gln Arg Arg Tyr Tyr Glu
2690 2695 2700

Lys Leu Thr Glu Phe Val Ala Asp Met Thr Lys Ile Phe Asp Asn
2705 2710 2715

Cys Arg Tyr Tyr Asn Pro Ser Asp Ser Pro Phe Tyr Gln Cys Ala
2720 2725 2730

Glu Val Leu Glu Ser Phe Phe Val Gln Lys Leu Lys Gly Phe Lys
2735 2740 2745

Ala Ser Arg Ser His Asn Asn Lys Leu Gln Ser Thr Ala Ser
2750 2755

<210> 16 

<211> 613 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 5093550CD1 

<400> 16 

Met Asp Val Ala Ile Glu Phe Ser Val Glu Glu Trp Gln Cys Leu
1 5 10 15

Asp Thr Ala Gln Gln Asn Leu Tyr Arg Asn Val Met Leu Glu Asn
20 25 30

Tyr Arg Asn Leu Val Phe Leu Gly Ile Ala Val Ser Lys Pro Asp
35 40 45

Leu Ile Thr Cys Leu Glu Gln Gly Lys Glu Pro Trp Asn Met Glu
50 55 60

Arg His Glu Met Val Ala Lys Pro Pro Gly Met Cys Cys Tyr Phe
65 70 75

Ala Gln Asp Leu Arg Pro Glu Gln Ser Ile Lys Ala Ser Leu Gln
80 85 90

Arg Ile Ile Leu Arg Lys Tyr Glu Lys Cys Gly His His Asn Leu
95 100 105

Gln Leu Lys Lys Gly Tyr Lys Ser Val Asp Glu Tyr Lys Val His
110 115 120

Lys Gly Ser Tyr Asn Gly Phe Asn Gln Cys Leu Thr Thr Thr Gln
125 130 135

Ser Lys Ile Phe Gln Cys Asp Lys Tyr Val Lys Asp Phe His Lys
140 145 150

Phe Ser Asn Ser Asn Arg His Lys Thr Glu Lys Asn Pro Phe Lys
155 160 165


Cys 


Lys 


kj jlu uys 


Gly Lys 


Ser 






V d _L 


Leu 


Ser 


His 


Leu 


Thr 








170 










1TC 
J. / 3 










JL O \J 


Gin 


His 


Lys Arg 


He 


His 


Thr 


j.nr 


vai 


Asn 


Ser Tyr 


Lys 


Leu 


Glu 








185 










i qd 

J..7 u 










195 


Glu 


Cys 


Gly Lys 


Ala 


Phe 


Asn 


va j. 






Thr 


Leu 


Ser 


Gin 


His 








200 










o n r 

ZUD 










210 


Lys 


Arg 


He His 


Thr Gly Gin 


.Lys 


ills 


Tyr 


Lys 


Cys 


vjIU 




Cys 








215 










A A U 










9 9 5 


Gly 


lie 


Ala Phe 


Asn 


Lys 


Ser 


Ser 


rilS 


Leu 


Asn 


Thr 




Lys 


lie 








230 




















9 An 


He 


His 


Thr Gly 


Glu 


Lys 


Ser 


Tyr 


Lys 


Arg 


Glu 


Glu 


Cys 


Gly 


Lys 








245 










A DU 










255 


Ala 


Phe 


Asn He 


Ser 


Ser 


His 


Leu 


Thr 


i nr 


His 


Lys 


He 


He 


His 








260 










OCR 










270 


Thr Gly 


Glu Asn 


Ala 


Tyr 


Lys 


Cys 


Lys 


Glu 


Cys 


Gly 


Lys 


Ala 


Phe 








275 










280 










285 


Asn 


Gin 


Ser Ser 


Thr 


Leu 


Thr 


Arg 


His 


Lys 


He He His Ala Gly 








290 










295 










■-> r\r\ 

300 


Glu 


Lys 


Pro Tyr 


He 


Cys 


Glu 


His 


Cys 


Gly 


Arg 


Ala 


Phe 


Asn 


GJ_n 








305 










310 










315 


Ser 


Ser 


Asn Leu 


Thr 


Lys 


His 


Lys 


Arg 


He 


His 


Thr Gly Asp 


Lys 








320 










325 










330 




Tyr 


Lys Cys 


Glu 


Glu 


Cys 


Gly Lys 


Ala 


Phe 


Asn 


Val 


Ser 


Ser 








335 










340 










345 


Thr Leu Thr Gln His Lys Arg Ile His Thr Gly Glu Lys Pro Tyr
350 355 360

Lys Cys Glu Glu Cys Gly Lys Ala Phe Asn Val Ser Ser Thr Leu
365 370 375

Thr Gln His Lys Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys
380 385 390

Glu Glu Cys Gly Lys Ala Phe Asn Thr Ser Ser His Leu Thr Thr
395 400 405

His Lys Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Glu Glu
410 415 420

Cys Gly Lys Ala Phe Asn Gln Phe Ser Gln Leu Thr Thr His Lys
425 430 435

Ile Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Lys Glu Cys Gly
440 445 450

Lys Ala Phe Lys Arg Ser Ser Asn Leu Thr Glu His Arg Ile Ile
455 460 465

His Thr Gly Glu Lys Pro Tyr Lys Cys Glu Glu Cys Gly Lys Ala
470 475 480

Phe Asn Leu Ser Ser His Leu Thr Thr His Lys Lys Ile His Thr
485 490 495

Gly Glu Lys Pro Tyr Lys Cys Lys Glu Cys Gly Lys Ala Phe Asn
500 505 510

Gln Ser Ser Thr Leu Ala Arg His Lys Ile Ile His Ala Gly Glu
515 520 525

Lys Pro Tyr Lys Cys Glu Glu Cys Gly Lys Ala Phe Tyr Gln Tyr
530 535 540

Ser Asn Leu Thr Gln His Lys Ile Ile His Thr Gly Glu Lys Pro
545 550 555

Tyr Lys Cys Glu Glu Cys Gly Lys Ala Phe Asn Trp Ser Ser Thr
560 565 570

Leu Thr Lys His Lys Val Ile His Thr Gly Glu Lys Pro Tyr Lys
575 580 585

Cys Lys Glu Cys Gly Lys Ala Phe Asn Gln Cys Ser Asn Leu Thr
590 595 600

Thr His Lys Lys Ile His Ala Val Glu Lys Ser Asp Lys
605 610











<210> 17 
<211> 240 
<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7487977CD1 



<400> 17 



Met Ser Lys Pro Val Asp His Val Lys Arg Pro Met Asn Ala Phe
1 5 10 15

Met Val Trp Ser Arg Ala Gln Arg Arg Lys Met Ala Gln Glu Asn
20 25 30

Pro Lys Met His Asn Ser Glu Ile Ser Lys Arg Leu Gly Ala Glu
35 40 45

Trp Lys Leu Leu Ser Glu Ala Glu Lys Arg Pro Tyr Ile Asp Glu
50 55 60

Ala Lys Arg Leu Arg Ala Gln His Met Lys Glu His Pro Asp Tyr
65 70 75

Lys Tyr Arg Pro Arg Arg Lys Pro Lys Asn Leu Leu Lys Lys Asp
80 85 90

Arg Tyr Val Phe Pro Leu Pro Tyr Leu Gly Asp Thr Asp Pro Leu
95 100 105

Lys Ala Ala Gly Leu Pro Val Gly Ala Ser Asp Gly Leu Leu Ser
110 115 120

Ala Pro Glu Lys Ala Arg Ala Phe Leu Pro Pro Ala Ser Ala Pro
125 130 135

Tyr Ser Leu Leu Asp Pro Ala Gln Phe Ser Ser Ser Ala Ile Gln
140 145 150

Lys Met Gly Glu Val Pro His Thr Leu Ala Thr Gly Ala Leu Pro
155 160 165

Tyr Ala Ser Thr Leu Gly Tyr Gln Asn Gly Ala Phe Gly Ser Leu
170 175 180

Ser Cys Pro Ser Gln His Thr His Thr His Pro Ser Pro Thr Asn
185 190 195

Pro Gly Tyr Val Val Pro Cys Asn Cys Thr Ala Trp Ser Ala Ser
200 205 210

Thr Leu Gln Pro Pro Val Ala Tyr Ile Leu Phe Pro Gly Met Thr
215 220 225

Lys Thr Gly Ile Asp Pro Tyr Ser Ser Ala His Ala Thr Ala Met
230 235 240



<210> 18
<212> PRT

<213> Homo sapiens



<220>

<221> misc_feature

<223> Incyte ID No: 1706514CD1 



<400> 18 



Met Asp 


Ser 


Val 


Val 


Phe 


Glu 


Asp 


Val 


Ala 


Val 


Asp 


Phe 


Thr 


Leu 


1 






5 










10 










15 


Glu Glu 


Trp 


Ala 


Leu 


Leu 


Asp 


Ser 


Ala 


Gin 


Arg 


Asp 


Leu 


Tyr 


Arg 








20 










25 










30 


Asp Val 


Met 


Leu 


Glu 


Thr 


Phe 


Arg 


Asn 


Leu 


Ala 


Ser 


Val 


Asp 


Asp 








35 










40 










45 


Gly Thr 


Gin 


Phe 


Lys 


Ala 


Asn 


Gly 


Ser 


Val 


Ser 


Leu 


Gin 


Asp 


Met 








50 










55 










60 


Tvr Glv 


Gin 


Glu 


Lys 


Ser 


Lys 


Glu 


Gin 


Thr 


He 


Pro 


Asn 


Phe 


Thr 








65 










70 










75 


rzl -w - Asn 


Asn 


Ser 


Cys 


Ala 


Tyr 


Thr 


Leu 


Glu 


Lys 


Asn 


Cys 


Glu 


Gly 








80 










85 










90 


Tvr Glv 


Thr 


Glu 


Asp 


His 


His 


Lys 


Asn 


Leu 


Arg 


Asn 


His 


Met 


Val 








95 










100 










105 


n is ^ ru. y 


Phe Cys Thr His Asn Glu Gly Asn Gin Tyr Gly Glu 


Ala 








110 










115 










120 


He His 


Gin 


Met 


Pro 


Asp 


Leu 


Thr 


Leu 


His 


Lys 


Lys 


Val 


Ser 


Ala 


















130 










135 


Gly Glu 


Lys 


Pro 


Tyr 




Cys 


Thr 


Lys 


Cys 


Arg 


Thr 


Val 


Phe 


Thr 








140 










145 










150 


ii X ±JCSLX 


Ser 


Ser 


Leu 


Lys 


Arg 


His 


Val 


Lys 


Ser 


His 


Cys 


Gly Arg 








155 










160 










165 


Ta/^ Ala 


Pro 


Pro 


Gly 


Glu 


Glu 


Cys 


Lys 


Gin 


Ala 


Cys 


He 


Cys 


Pro 








170 










175 










180 


Ser His 


Leu 


His 


Ser 


His 


Gly 


Arg 


Thr 


Asp 


Thr 


Glu 


Glu 


Lys 


Pro 








185 










190 










195 


Tyr Lys 


Cys 


Gin 


Ala 


Cys 


Gly 


Gin 


Thr 


Phe 


Gin 


His 


Pro 


Arg 


Tyr 








200 










205 










210 


T .cai l Opf 


His 


His 


Val 


Lys 


Thr 


His 


Thr 


Ala 


Glu 


Lys 


Thr 


Tyr 


Lys 








215 










220 










225 


Cys Glu 


Gin 


Cys 


Arg 


Met 


Ala 


Phe 


Asn Gly 


Phe 


Ala 


Ser 


Phe 


Thr 








230 










235 










240 


Arg His 


Val 


Arg 


Thr 


His 


Thr 


Lys 


Asp 


Arg 


Pro 


Tyr 


Lys 


Cys 


Gin 








245 










250 










255 


Glu Cys 


Gly Arg 


Ala 


Phe 


He 


Tyr 


Pro 


Ser 


Thr 


Phe 


Gin 


Arg 


His 








260 










265 










270 


Met Thr 


Thr 


His 


Thr 


Gly 


Glu 


Lys 


Pro 


Tyr 


Lys 


Cys 


Gin 


His 


Cys 








275 










280 










285 


Gly Lys 


Ala 


Phe 


Thr 


Tyr 


Pro 


Gin 


Ala 


Phe 


Gin 


Arg 


His 


Glu 


Lys 








290 










295 










300 


Thr His 


Thr Gly 


Glu 


Lys 


Pro 


Tyr 


Glu 


Cys 


Lys 


Gin 


Cys 


Gly 


Lys 








305 










310 










315 


Thr Phe 


Ser 


Trp 


Ser 


Glu 


Thr 


Leu 


Arg Val 


His 


Met 


Arg 


He 


His 








320 










325 










330 


Thr Gly 


Asp 


Lys 


Leu 


Tyr 


Lys 


Cys 


Glu 


His 


Cys 


Gly Lys 


Ala 


Phe 








335 










340 










345 


Thr Ser 


Ser 


Arg 


Ser 


Phe 


Gin 


Gly 


His 


Leu 


Arg 


Thr 


His 


Thr 


Gly 








350 










355 










360 
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Lys 


XT ±. KJ 


A y x 


Glu 




Lys 


Gin Cys 


Gly Lys 


Ala 


Phe 


Thr 


Trp 










365 








370 










375 


Ser 


Ser 


Thr 


Phe 


Arg 
380 


Glu 


His 


Val Arg 


He 
385 


His 


Thr 


Gin 


Glu 


Gin 
390 


Leu 


Tyr 


Lys 


Cys 


Glu 
395 


Gin 


Cys 


Gly Lys 


Ala 
400 


Phe 


Thr 


Ser 


Ser 


Arg 
405 


Ser 


Phe 


Arg 


Gly 


His 


Leu 


Arg 


Thr His 


Thr Gly 


Glu 


Lys 


Pro 


Tyr 










410 








415 










420 


Glu 


Cys 


Lys 


Gin 


Cys 

*± Z. J 


Gly 


Lys 


Thr Phe 


Thr 
430 


Ttt} 

A1 r 


Ser 


Ser 


Thr 


Phe 
435 


Arg 


Glu 


His 


Val 


Arg 
a An 

f± *± VJ 


lie 


His 


Thr Gin 


OX U. 

445 


Gin 


Leu 


His 


Lys 


Cys 
450 




nlS 


Cys 


a~\ vr 
oxy 


Lys 
455 


Ala 


* lie 


Thr Ser 


Del 

460 


Arg 


Ala 


Phe 


Gin 


Gly 
465 


His 


Leu 


Arg 


Met 


His 
470 


Thr 


Gly 


Glu Lys 


Pro 
475 


Tyr 


Glu 


Cys 


Lys 


Gin 
480 


Cys 


Gly 


Lys 


Thr 


Phe 
485 


Thr 


Trp 


Ser Ser 


Thr 
490 


Leu 


His 


Asn 


His 


Val 
495 


Arg 


Met 


His 


Thr 


Gly Glu 


Lys 


Pro His 


Lys 


Cys 


Lys 


Gin 


Cy s 


Gly 










500 








505 










510 


Met 


Ser 


Phe 


Lys 


Trp 
515 


His 


Ser 


Ser Phe 


Arg 
520 


Asn 


His 


Leu 


Arg 


Met 
525 


His 


Thr 


Gly 


Gin 


Lys 
530 


Ser 


His 


Glu Cys 


Gin 
535 


Ser 


Tyr 


Ser 


Lys 


Ala 
540 


Phe 


Ser 


Cys 


Gin 


Val 
545 


He 


Leu 


Ser Lys 


Thr 
550 


Ser 


Glu 


Ser 


Thr 


His 
555 



<210> 19 
<211> 184 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7488247CD1 

<400> 19 



Met Ala Asp His Leu Met Leu Ala Glu Gly Tyr Arg Leu Val Gln
1 5 10 15

Arg Pro Pro Ser Ala Ala Ala Ala His Gly Pro His Ala Leu Arg
20 25 30

Thr Leu Pro Pro Tyr Ala Gly Pro Gly Leu Asp Ser Gly Leu Arg
35 40 45

Pro Arg Gly Ala Pro Leu Gly Pro Pro Pro Pro Arg Gln Pro Gly
50 55 60

Ala Leu Ala Tyr Gly Ala Phe Gly Pro Pro Ser Ser Phe Gln Pro
65 70 75

Phe Pro Ala Val Pro Pro Pro Ala Ala Gly Ile Ala His Leu Gln
80 85 90

Pro Val Ala Thr Pro Tyr Pro Gly Arg Ala Ala Ala Pro Pro Asn
95 100 105

Ala Pro Gly Gly Pro Pro Gly Pro Gln Pro Ala Pro Ser Ala Ala
110 115 120

Ala Pro Pro Pro Pro Ala His Ala Leu Gly Gly Met Asp Ala Glu
125 130 135

Leu Ile Asp Glu Glu Ala Leu Thr Ser Leu Glu Leu Glu Leu Gly
140 145 150

Leu His Arg Val Arg Glu Leu Pro Glu Leu Phe Leu Gly Gln Ser
155 160 165

Glu Phe Asp Cys Phe Ser Asp Leu Gly Ser Ala Pro Pro Ala Gly
170 175 180

Ser Val Ser Cys



<210> 20 

<211> 553 

<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 1427269CD1 



<400> 20 



Met Pro Gly Met Met Glu Lys Gly Pro Glu Leu Leu Gly Lys Asn
1 5 10 15

Arg Ser Ala Asn Gly Ser Ala Lys Ser Pro Ala Gly Gly Gly Gly
20 25 30

Ser Gly Ala Ser Ser Thr Asn Gly Gly Leu His Tyr Ser Glu Pro
35 40 45

Glu Ser Gly Cys Ser Ser Asp Asp Glu His Asp Val Gly Met Arg
50 55 60

Val Gly Ala Glu Tyr Gln Ala Arg Ile Pro Glu Phe Asp Pro Gly
65 70 75

Ala Thr Lys Tyr Thr Asp Lys Asp Asn Gly Gly Met Leu Val Trp
80 85 90

Ser Pro Tyr His Ser Ile Pro Asp Ala Lys Leu Asp Glu Tyr Ile
95 100 105

Ala Ile Ala Lys Glu Lys His Gly Tyr Asn Val Glu Gln Ala Leu
110 115 120

Gly Met Leu Phe Trp His Lys His Asn Ile Glu Lys Ser Leu Ala
125 130 135

Asp Leu Pro Asn Phe Thr Pro Phe Pro Asp Glu Trp Thr Val Glu
140 145 150

Asp Lys Val Leu Phe Glu Gln Ala Phe Ser Phe His Gly Lys Ser
155 160 165

Phe His Arg Ile Gln Gln Met Leu Pro Asp Lys Thr Ile Ala Ser
170 175 180

Leu Val Lys Tyr Tyr Tyr Ser Trp Lys Lys Thr Arg Ser Arg Thr
185 190 195

Ser Leu Met Asp Arg Gln Ala Arg Lys Leu Ala Asn Arg His Asn
200 205 210

Gln Gly Asp Ser Asp Asp Asp Val Glu Glu Thr His Pro Met Asp
215 220 225

Gly Asn Asp Ser Asp Tyr Asp Pro Lys Lys Glu Ala Lys Lys Glu
230 235 240

Gly Asn Thr Glu Gln Pro Val Gln Thr Ser Lys Ile Gly Leu Gly
245 250 255

Arg Arg Glu Tyr Gln Ser Leu Gln His Arg His His Ser Gln Arg
260 265 270

Ser Lys Cys Arg Pro Pro Lys Gly Met Tyr Leu Thr Gln Glu Asp
275 280 285


Val 


Val 


Ala 


Val 


Ser 
290 


Cys 


Ser 


Pro 


Asn 


Ala 
not: 


Ala 


Asn 


Thr 


He 


Leu 
300 


Arg 


Gin 


Leu 


Asp Met 


Glu 


Leu 


He 


Ser 


Leu 


Lys 


Arg 


Gin 


Val 


Gin 




















o x \J 










315 


Asn 


Ala 


Lys 


Gin 


Val 
"a o n 


Asn 


Co-r 


Ala 


Leu 


Lys 
325 


Gin 


Lys 


Met 


Glu 


Glv 
330 


Gly 


xie 


Glu 


Glu 


.f ne 


Lys 


Pro 


Pro 




Ser 
340 


Asn 


Gin 


Lys 


He 


Asn 
345 


Ala 


Arg 


Trp 


Thr 


Thr 


Glu 


Glu 


Gin 


Leu 


Leu 
355 


Ala 


Val 


Gin 


Gly 


Val 
360 


Arg 


Lys 


Tyr Gly 


Lys 


ASp 


Phe 


Gin 


Ala 


He Ala Asp Val 


Tie 

_1_ _L fc= 


Gly 










s c c 










370 










375 


Asn 


Lys 


Thr Val 


Gly 


bin 


Val 


Lys 


Asn 


Phe 


Phe 


Val 


Asn 


xyi: 


Arg 










o o rv 










385 










390 


Arg 


Arg 


Phe 


Asn 


Leu 

-y rs C 

39 b 


Glu 


Glu 


Val 


Leu 


Gin 
400 


Glu 


Trp 


Glu 


Ala 


Glu 
405 


Gin 


Gly 


Thr 


Gin 


Ala 


Ser 


Asn Gly Asp Ala 


Ser 


Thr 


Leu 


Gly 


Glu 










A 1 A 










415 










420 


Glu 


Thr 


Lys 


Ser 


Ala 


Ser 


Asn 


Val 


Pro 


Ser Gly 


Lys 


Ser 


Thr 


Asp 










/IOC 










430 










435 


Glu 


Glu 


Glu 


Glu 


Ala 
440 


Gin 


Thr 


Pro 


Gin 


Ala 
445 


Pro 


Arg 


Thr 


Leu 


Gly 
450 


Pro 


Ser 


Pro 


Pro 


Ala 

Add 


Pro 


Ser 


Ser 


Thr 


Pro 
460 


Thr 


Pro 


Thr 


Ala 


Pro 
465 


He 


Ala 


Thr 


Leu 


Asn 
470 


Gin 


Pro 


Pro 


Pro 


Leu 

475 


Leu 


Arg 


Pro 


Thr 


Leu 
480 


Pro 


Ala 


Ala 


Pro 


Ala 
485 


Leu 


His 


Arg 


Gin 


Pro 
490 


Pro 


Pro 


Leu 




Gin 
495 


Gin 


Ala 


Arg 


Phe 


He 
500 


Gin 


Pro 


Arg 


Pro 


Thr 
505 


Leu 


Asn 


Gin 


Pro 


Pro 


Pro 


Pro 


Leu 


He 


Arg 
515 


Pro 


Ala 


Asn 


Ser 


Met 
520 


Pro 


Pro 


Arg 


Leu 


Asn 
525 


Pro 


Arg 


Pro 


Val 


Leu 


Ser Thr Val Gly Gly Gin 


Gin 


Pro 


Pro 


Ser 










530 










535 










540 


Leu 


He 


Gly 


He 


Gin 
545 


Thr 


Asp 


Ser 


Gin 


Ser 
550 


Ser 


Leu 


His 







<210> 21 

<211> 371 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 103135CD1 

<400> 21 

Met Asp Met Ala Gln Glu Pro Val Thr Phe Arg Asp Val Ala Ile
1 5 10 15

Tyr Phe Ser Arg Glu Glu Trp Ala Cys Leu Glu Pro Ser Gln Arg
20 25 30

Ala Leu Tyr Arg Asp Val Met Leu Asp Asn Phe Ser Ser Val Ala
35 40 45

Ala Leu Gly Phe Cys Ser Pro Arg Pro Asp Leu Val Ser Arg Leu
50 55 60
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Glu 


Gin 


Trr> 


Glu 


Glu 


Pro 


Trp Val 


Glu 


Asp Arg 


Glu Arg 


Pro 


Glu 










65 










70 






75 


Phe 


Gin 


Ala 


Val 


Gin 


Arg 


Gly 


Pro 


Arg 


Pro Gly Ala Arg 


Lys 


Ser 










80 










85 






90 


Ala 


Asp 


Pro 


Lys 


Arg 


His 


Cys 


Asp 


His 


Pro Ala 


Trp Ala 


His 


Lys 










95 










100 






105 


Lys 


Thr 


His 


Val 


Arg 


Arg 


Glu 


Arg 


Ala 


Arg Glu 


Gly Ser 


Ser 


Phe 








110 










115 






120 


Arg Lys Gly Phe Arg Leu Asp Thr Asp Asp Gly Gin Leu 


Pro 


Arg 










19c 










J- -J 






135 


Ala 


Ala 


Pro 


pi mm 


Arg 


inr 


Asp 


nla 


Lys 


Prn Thr 

JT J. W XAAJ- 


Ala Phe 


Pro 


Cys 










J. ft u 










145 






150 


Gin 


Val 


Leu 


inr 


pi 
vj±n 


Arg 


Cys 


Gly 


Arg 




Gly Arg 


Arg 


Glu 










1 cc 
IJJ 










160 






165 


Arg 


Arg 


Lys 


Ljj_n 


Arg 


Ala 


val 


Glu 


Leu 


Cot* "P Vl 
OCX ■Tile 


lie Cys 


Gly Thr 










1 7 n 
1 / u 










175 






180 


Cys Gly Lys 


7v "1 —» 


Leu 




Cys 


His 


Ser 




Leu Ala 


His 


Gin 










1 0 D 










190 






195 


Thr 


Val 


His 


i nr 


vsiy 


1 nr 


Lys 


Ala 


Phe 


Glu Cys 


Pro Glu 


Cys 


Glv 




















205 






210 


Gin 


Thr 


Phe 


Arg 


Trp 


Ala 


Ser 


Asn 


Leu 


Gin Arg 


His Gin 


Ly s 


Asn 




















220 






225 


His 


Thr 


Arg 


pi , , 

GJLU 


Jjys 


Pro 


rne 


Cys 


Cys 


Glu Ala 


Cys Gly 


Gin 


Ala 










0 n 










235 






240 


Phe 


Ser 


Leu 


jjys 


Asp 


Arg 


Leu 


Ala 


Gin 


His Arg 


Lys Val 


His 


Thr 










Z *± D 










250 






255 


Glu 


His 


Arg 


Pro 


Tyr 


Ser 


Cys 


Gly Asp 


Cys Gly 


Lys Ala 


Phe 


Lys 










260 










265 






270 


Gin 


Lys 


Ser 


Asn 


Leu 


Leu 


Arg 


His 


Gin 


Leu Val 


His Thr 


Gly 


Glu 










275 










280 






285 


Arg 


Pro 


Phe 


Tyr 


Cys 


Ala 


Asp 


Cys 


Gly 


Lys Ala 


Phe Arg 


Thr 


Lys 










290 










295 






300 


Glu 


Asn 


Leu 


Ser 


His 


His 


Gin 


Arg 


Val 


His Ser Gly Glu 


Lys 


Pro 










305 










310 






315 


Tyr 


Thr 


Cys 


Ala 


Glu 


Cys 


Gly 


Lys 


Ser 


Phe Arg 


Trp Pro 


Lys 


Gly 










320 










325 






330 


Phe 


Ser 


lie 


His 


Arg 


Arg 


Leu 


His 


Leu 


Thr Lys 


Arg Phe 


Tyr 


Glu 










335 










340 






345 


Cys 


Gly His 


Cys 


Gly 


Lys 


Gly 


Phe 


Arg 


His Leu 


Gly Phe 


Phe 


Thr 










350 










355 






360 


Arg 


His 


Gin 


Arg 


Thr 


His 


Arg 


His 


Gly Glu Val 
















365 










370 









<210> 22 
<211> 837 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 1907346CD1 

<400> 22 

Met Leu Pro Lys Glu Glu Val Trp Lys Lys Arg Lys Arg Lys Glu
1 5 10 15

Lys Glu Ser Gly Met Ala Leu Thr Gln Val Arg Leu Thr Phe Arg
20 25 30

Asp Val Ala Ile Glu Phe Ser Gln Glu Glu Trp Lys Cys Leu Asp
35 40 45

Pro Ala Gln Arg Ile Leu Tyr Arg Asp Val Met Leu Glu Asn Tyr
50 55 60

Trp Asn Leu Val Ser Leu Gly Leu Cys His Phe Asp Met Asn Ile
65 70 75

Ile Ser Met Leu Glu Glu Gly Lys Glu Pro Trp Thr Val Lys Ser
80 85 90

Cys Val Lys Ile Ala Arg Lys Pro Arg Thr Arg Glu Cys Val Lys
95 100 105

Gly Val Val Thr Asp Ile Pro Pro Lys Cys Thr Ile Lys Asp Leu
110 115 120

Leu Pro Lys Glu Lys Ser Ser Thr Glu Ala Val Phe His Thr Val
125 130 135

Val Leu Glu Arg His Glu Ser Pro Asp Ile Glu Asp Phe Ser Phe
140 145 150

Lys Glu Pro Gln Lys Asn Val His Asp Phe Glu Cys Gln Trp Arg
155 160 165

Asp Asp Thr Gly Asn Tyr Lys Gly Val Leu Met Ala Gln Lys Glu
170 175 180

Gly Lys Arg Asp Gln Arg Asp Arg Arg Asp Ile Glu Asn Lys Leu
185 190 195

Met Asn Asn Gln Leu Gly Val Ser Phe His Ser His Leu Pro Glu
200 205 210

Leu Gln Leu Phe Gln Gly Glu Gly Lys Met Tyr Glu Cys Asn Gln
215 220 225

Val Glu Lys Ser Thr Asn Asn Gly Ser Ser Val Ser Pro Leu Gln
230 235 240

Gln Ile Pro Ser Ser Val Gln Thr His Arg Ser Lys Lys Tyr His
245 250 255

Glu Leu Asn His Phe Ser Leu Leu Thr Gln Arg Arg Lys Ala Asn
260 265 270

Ser Cys Gly Lys Pro Tyr Lys Cys Asn Glu Cys Gly Lys Ala Phe
275 280 285

Thr Gln Asn Ser Asn Leu Thr Ser His Arg Arg Ile His Ser Gly
290 295 300

Glu Lys Pro Tyr Lys Cys Ser Glu Cys Gly Lys Thr Phe Thr Val
305 310 315

Arg Ser Asn Leu Thr Ile His Gln Val Ile His Thr Gly Glu Lys
320 325 330

Pro Tyr Lys Cys His Glu Cys Gly Lys Val Phe Arg His Asn Ser
335 340 345

Tyr Leu Ala Thr His Arg Arg Ile His Thr Gly Glu Lys Pro Tyr
350 355 360

Lys Cys Asn Glu Cys Gly Lys Ala Phe Arg Gly His Ser Asn Leu
365 370 375

Thr Thr His Gln Leu Ile His Thr Gly Glu Lys Pro Phe Lys Cys
380 385 390

Asn Glu Cys Gly Lys Leu Phe Thr Gln Asn Ser His Leu Ile Ser
395 400 405

His Trp Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Asn Glu
410 415 420

Cys Gly Lys Ala Phe Ser Val Arg Ser Ser Leu Ala Ile His Gln
425 430 435

Thr Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Asn Glu Cys Gly
440 445 450

Lys Val Phe Arg Tyr Asn Ser Tyr Leu Gly Arg His Arg Arg Val
455 460 465

His Thr Gly Glu Lys Pro Tyr Lys Cys Asn Glu Cys Gly Lys Ala
470 475 480

Phe Ser Met His Ser Asn Leu Ala Thr His Gln Val Ile His Thr
485 490 495

Gly Thr Lys Pro Phe Lys Cys Asn Glu Cys Ser Lys Val Phe Thr
500 505 510

Gln Asn Ser Gln Leu Ala Asn His Arg Arg Met His Thr Gly Glu
515 520 525

Lys Thr Tyr Lys Cys Asn Glu Cys Gly Lys Ala Phe Ser Val Arg
530 535 540

Ser Ser Leu Thr Thr His Gln Ala Ile His Ser Gly Glu Lys Pro
545 550 555

Tyr Lys Cys Ile Glu Cys Gly Lys Ser Phe Thr Gln Lys Ser His
560 565 570

Leu Arg Ser His Arg Gly Ile His Ser Gly Glu Lys Pro Tyr Lys
575 580 585

Cys Asn Glu Cys Gly Lys Val Phe Ala Gln Thr Ser Gln Leu Ala
590 595 600

Arg His Trp Arg Val His Thr Gly Glu Lys Pro Tyr Lys Cys Asn
605 610 615

Asp Cys Gly Arg Ala Phe Ser Asp Arg Ser Ser Leu Thr Phe His
620 625 630

Gln Ala Ile His Thr Gly Glu Lys Pro Tyr Lys Cys His Glu Cys
635 640 645

Gly Lys Val Phe Ai^ His Asn Ser Tyr Leu Ala Thr His Arg Arg 
650 655 660 

He His Thr Gly Glu Lys Pro Tyr Lys Cys Asn Glu Cys Gly Lys 
665 670 675 

Ala Phe Ser Met His Ser Asn Leu Thr Thr His Lys Val He His 
680 685 690 

Thr Gly Glu Lys Pro Tyr Lys Cys Asn Gin Cys Gly Lys Val Phe 
695 700 705 

Thr Gin Asn Ser His Leu Ala Asn His Gin Arg Thr His Thr Gly 
710 715 720 

Glu Lys Pro Tyr Arg Cys Asn Glu Cys Gly Lys Ala Phe Ser Val 
725 730 735 

Arg Ser Ser Leu Thr Thr His Gin Ala He His Thr Gly Lys Lys 
740 745 750 

Pro Tyr Lys Cys Asn Glu Cys Gly Lys Val Phe Thr Gin Asn Ala 
755 760 765 

His Leu Ala Asn His Arg Arg He His Thr Gly Glu Lys Pro Tyr 
770 775 780 

Arg Cys Thr Glu Cys Gly Lys Ala Phe Arg Val Arg Ser Ser Leu 
785 790 795 

Thr Thr His Met Ala He His Thr Gly Glu Lys Arg Tyr Lys Cys 
800 805 810 

Asn Glu Cys Gly Lys Val Phe Arg Gin Ser Ser Asn Leu Ala Ser 
815 820 825 

His His Arg Met His Thr Gly Glu Lys Pro Tyr Lys 
830 835 

<210> 23 
<211> 549 
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<212> PRT 

<213> Hoino sapiens 



<220> 

<221> misc_f eafcure 

<223> Incyte ID No: 3041036CD1 



<400> 23 



Met Ala Ala Gln Leu Leu Thr Asp Glu Ala Leu Glu Ser Val Thr
1 5 10 15

Phe Arg Asp Val Thr Val Asp Phe Thr Gln Glu Glu Trp Gln Gln
20 25 30

Leu Glu Pro Ala Gln Lys Asp Leu Tyr Arg Asp Val Met Leu Glu
35 40 45

Asn Tyr Arg Asn Leu Val Ser Leu Asp Trp Glu Thr Arg Pro Glu
50 55 60

Met Lys Glu Leu Asp Pro Lys Asn Asp Ile Ser Glu Asp Lys Leu
65 70 75

Ser Val Val Gly Glu Ala Thr Gly Gly Pro Thr Arg Asn Gly Ala
80 85 90

Arg Gly Pro Gly Ser Glu Gly Val Trp Glu Pro Gly Ser Trp Pro
95 100 105

Glu Arg Pro Arg Gly Asp Ala Gly Ala Glu Trp Glu Pro Leu Gly
110 115 120

Ile Pro Gln Gly Asn Lys Leu Leu Gly Gly Ser Val Pro Ala Cys
125 130 135

His Glu Leu Lys Ala Phe Ala Asn Gln Gly Cys Val Leu Val Pro
140 145 150

Pro Arg Leu Asp Asp Pro Thr Glu Lys Gly Ala Cys Pro Pro Val
155 160 165

Arg Arg ??? Lys Asn Phe Ser Ser Thr Ser Asp Leu Ser Lys Pro
170 175 180

Pro Met Pro Cys Glu Glu Lys Lys Thr Tyr Asp Cys ??? Glu Cys
185 190 195

Gly Lys ??? Phe Ser Arg Ser Ser Ser Leu Ile Lys His Gln Arg
200 205 210

Ile His Thr Gly Glu Lys Pro Phe Glu Cys Asp Thr Cys Gly Lys
215 220 225

His Phe Ile Glu Arg Ser Ser Leu Thr Ile His Gln Arg Val His
230 235 240

Thr Gly Glu Lys Pro Tyr Ala Cys Gly Asp Cys Gly Lys Ala Phe
245 250 255

Ser Gln Arg Met Asn Leu Thr Val His Gln Arg Thr His Thr Gly
260 265 270

Glu Lys Pro Tyr Val Cys Asp Val Cys Gly Lys Ala Phe Arg Lys
275 280 285

Thr Ser Ser Leu Thr Gln His Glu Arg Ile His Thr Gly Glu Lys
290 295 300

Pro Tyr Ala Cys Gly Asp Cys Gly Lys Ala Phe Ser Gln Asn Met
305 310 315

His Leu Ile Val His Gln Arg Thr His Thr Gly Glu Lys Pro Tyr
320 325 330

Val Cys Pro Glu Cys Gly Arg Ala Phe Ser Gln Asn Met His Leu
335 340 345

Thr Glu His Gln Arg Thr His Thr Gly Glu Lys Pro Tyr Ala Cys
350 355 360

??? Glu Cys ??? Lys Ala Phe ??? Lys Ser Ser Ser Leu Thr Leu
365 370 375

His Gln Arg Asn His Thr Gly Glu Lys Pro Tyr Val Cys Gly Glu
380 385 390

Cys Gly Lys Ala Phe Ser Gln Ser Ser Tyr Leu Ile Gln His Gln
395 400 405

Arg Phe His Ile Gly Val Lys Pro Phe Glu Cys Ser Glu Cys Gly
410 415 420

Lys Ala Phe ??? ??? Asn Ser Ser Leu Thr Gln His Gln Arg Ile
425 430 435

His Thr Gly Glu Lys Pro Tyr Glu Cys Tyr Ile Cys Lys Lys His
440 445 450

Phe Thr Gly Arg Ser Ser Leu Ile Val His Gln Ile Val His Thr
455 460 465

Gly Glu Lys Pro Tyr Val Cys Gly Glu Cys Gly Lys Ala Phe Ser
470 475 480

Gln Ser Ala Tyr Leu Ile Glu His Gln Arg Ile His Thr Gly Glu
485 490 495

Lys Pro Tyr Arg Cys Gly Gln Cys Gly Lys Ser Phe Ile Lys Asn
500 505 510

Ser Ser Leu Thr Val His Gln Arg Ile His Thr Gly Glu Lys Pro
515 520 525

Tyr Arg Cys Gly Glu Cys Gly Lys Thr Phe Ser Arg Asn Thr Asn
530 535 540

Leu Thr Arg His Leu Arg Ile His Thr
545

<210> 24
<211> 555
<212> PRT
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No: 3856879CD1
<400> 24



Met Ala Ala Ala Arg Leu Leu Pro Val Pro Ala Gly Pro Gln Ala
1 5 10 15

Lys Leu Thr Phe Glu Asp Val Ala Val Leu Leu Ser Gln Asp Glu
20 25 30

Trp Asp Arg Leu Cys Pro Ala Gln Arg Gly Leu Tyr Arg Asn Val
35 40 45

Met Met Glu Thr Tyr Gly Asn Val Val Ser Leu Gly Leu Pro Gly
50 55 60

Ser Lys Pro Asp Ile Ile Ser Gln Leu Glu Arg Gly Glu Asp Pro
65 70 75

Trp Val Leu Asp Arg Lys Gly Ala Lys Lys Ser Gln Gly Leu Trp
80 85 90

Ser Asp Tyr Ser Asp Asn Leu Lys Tyr Asp His Thr Thr Ala Cys
95 100 105

Thr Gln Gln Asp Ser Leu Ser Cys Pro Trp Glu Cys Glu Thr Lys
110 115 120

Gly Glu Ser Gln Asn Thr Asp Leu Ser Pro Lys Pro Leu Ile Ser
125 130 135

Glu Gln Thr Val Ile Leu Gly Lys Thr Pro Leu Gly Arg Ile Asp
140 145 150

Gln Glu Asn Asn Glu Thr Lys Gln Ser Phe Cys Leu Ser Pro Asn
155 160 165

Ser Val Asp His Arg Glu Val Gln Val Leu Ser Gln Ser Met Pro
170 175 180

Leu Thr Pro His Gln Ala Val Pro Ser Gly Glu Arg Pro Tyr Met
185 190 195

Cys Val Glu Cys Gly Lys Cys Phe Gly Arg Ser Ser His Leu Leu
200 205 210

Gln His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Val Cys Ser
215 220 225

Val Cys Gly Lys Ala Phe Ser Gln Ser Ser Val Leu Ser Lys His
230 235 240

Arg Arg Ile His Thr Gly Glu Lys Pro Tyr Glu Cys Asn Glu Cys
245 250 255

Gly Lys Ala Phe Arg Val Ser Ser Asp Leu Ala Gln His His Lys
260 265 270

Ile His Thr Gly Glu Lys Pro His Glu Cys Leu Glu Cys Arg Lys
275 280 285

Ala Phe Thr Gln Leu Ser His Leu Ile Gln His Gln Arg Ile His
290 295 300

Thr Gly Glu Arg Pro Tyr Val Cys Pro Leu Cys Gly Lys Ala Phe
305 310 315

Asn His Ser Thr Val Leu Arg Ser His Gln Arg Val His Thr Gly
320 325 330

Glu Lys Pro His Arg Cys Asn Glu Cys Gly Lys Thr Phe Ser Val
335 340 345

Lys Arg Thr Leu Leu Gln His Gln Arg Ile His Thr Gly Glu Lys
350 355 360

Pro Tyr Thr Cys Ser Glu Cys Gly Lys Ala Phe Ser Asp Arg Ser
365 370 375

Val Leu Ile Gln His His Asn Val His Thr Gly Glu Lys Pro Tyr
380 385 390

Glu Cys Ser Glu Cys Gly Lys Thr Phe Ser His Arg Ser Thr Leu
395 400 405

Met Asn His Glu Arg Ile His Thr Glu Glu Lys Pro Tyr Ala Cys
410 415 420

Tyr Glu Cys Gly Lys Ala Phe Val Gln His Ser His Leu Ile Gln
425 430 435

His Gln Arg Val His Thr Gly Glu Lys Pro Tyr Val Cys Gly Glu
440 445 450

Cys Gly His Ala Phe Ser Ala Arg Arg Ser Leu Ile Gln His Glu
455 460 465

Arg Ile His Thr Gly Glu Lys Pro Phe Gln Cys Thr Glu Cys Gly
470 475 480

Lys Ala Phe Ser Leu Lys Ala Thr Leu Ile Val His Leu Arg Thr
485 490 495

His Thr Gly Glu Lys Pro Tyr Glu Cys Asn Ser Cys Gly Lys Ala
500 505 510

Phe Ser Gln Tyr Ser Val Leu Ile Gln His Gln Arg Ile His Thr
515 520 525

Gly Glu Lys Pro Tyr Glu Cys Gly Glu Cys Gly Arg Ala Phe Asn
530 535 540

Gln His Gly His Leu Ile Gln His Gln Lys Val His Arg Lys Leu
545 550 555






<210> 25 

<211> 601 

<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 4178665CD1 



<400> 25 



Met Leu Cys Trp Leu Gln Glu Asn Asn ??? Cys Leu Leu Leu Cys
1 5 10 15

Phe Leu Ser Gly Leu Leu Ser Arg His Lys Thr Lys Lys Leu Ser
20 25 30

Ser Glu Lys Asp Ile His Glu Ile Ser Leu Ser Lys Glu Ser Ile
35 40 45

Ile Glu Lys Ser Lys Thr Leu Arg Leu Lys Gly Ser Ile Phe ???
50 55 60

Asn Glu Trp Gln Asn Lys Ser Glu Phe Glu Gly Gln Gln Gly Leu
65 70 75

Lys Glu Arg Ser Ile Ser Gln Lys Lys Ile Val Ser Lys Lys Met
80 85 90

Ser Thr Asp Arg Lys Arg Pro Ser Phe Thr Leu Asn Gln Arg Ile
95 100 105

His Asn Ser Glu Lys Ser Cys Asp Ser His Leu Val Gln His Gly
110 115 120

Lys Ile Asp Ser Asp Val Lys His Asp Cys Lys Glu Cys Gly Ser
125 130 135

Thr Phe Asn Asn Val Tyr Gln Leu Thr Leu His Gln Lys Ile His
140 145 150

Thr Gly Glu Lys Ser Cys Lys Cys Glu Lys Cys Gly Lys Val Phe
155 160 165

Ser His Ser Tyr Gln Leu Thr Leu His Gln Arg Phe His Thr Gly
170 175 180

Glu Lys Pro Tyr Glu Cys Gln Glu Cys Gly Lys Thr Phe Thr Leu
185 190 195

Tyr Pro Gln Leu Asn Arg His Gln Lys Ile His Thr Gly Lys Lys
200 205 210

Pro Tyr Met Cys Lys Lys Cys Asp Lys Gly Phe Phe Ser Arg Leu
215 220 225

Glu Leu Thr Gln His Lys Arg Ile His Thr Gly Lys Lys Ser Tyr
230 235 240

Glu Cys Lys Glu Cys Gly Lys Val Phe Gln Leu Ile Phe Tyr Phe
245 250 255

Lys Glu His Glu Arg Ile His Thr Gly Lys Lys Pro Tyr Glu Cys
260 265 270

Lys Glu Cys Gly Lys Ala Phe Ser Val Cys Gly Gln Leu Thr Arg
275 280 285

His Gln Lys Ile His Thr Gly Val Lys Pro Tyr Glu Cys Lys Glu
290 295 300

Cys Gly Lys Thr Phe Arg Leu Ser Phe Tyr Leu Thr Glu His Arg
305 310 315

Arg Thr His Ala Gly Lys Lys Pro Tyr Glu Cys Lys Glu Cys Gly
320 325 330

Lys Ser Phe Asn Val Arg Gly Gln Leu Asn Arg His Lys Thr Ile
335 340 345

His Thr ??? Ile Lys Pro Phe ??? Cys Lys Val ??? Glu Lys Ala
350 355 360

Phe Ser Tyr Ser Gly Asp Leu Arg Val His Ser Arg Ile His Thr
365 370 375

Gly Glu Lys Pro Tyr Glu Cys Lys Glu Cys Gly Lys Ala Phe Met
380 385 390

Leu Arg Ser Val Leu Thr Glu His Gln Arg Leu His Thr Gly Val
395 400 405

Lys Pro Tyr Glu Cys Lys Glu Cys Gly Lys Thr Phe Arg Val Arg
410 415 420

Ser Gln Ile Ser Leu His Lys Lys Ile His Thr Asp Val Lys Pro
425 430 435

Tyr Lys Cys Val Arg Cys Gly Lys Thr Phe Arg Phe ??? Phe Tyr
440 445 450

Leu Thr Glu His Gln Arg Ile His Thr Gly Glu Lys Pro Tyr Lys
455 460 465

Cys Lys Glu Cys Gly Lys Ala Phe Ile Arg Arg Gly Asn Leu Lys
470 475 480

Glu His Leu Lys Ile His Ser Gly Leu Lys Pro Tyr Asp Cys Lys
485 490 495

Glu Cys Gly Lys Ser Phe Ser Arg Arg Gly Gln Phe Thr Glu His
500 505 510

Gln Lys Ile His Thr Gly Val Lys Pro Tyr Lys Cys Lys Glu Cys
515 520 525

Gly Lys Ala Phe Ser Arg Ser Val Asp Leu Arg Ile His Gln Arg
530 535 540

Ile His Thr Gly Glu Lys Pro Tyr Glu Cys Lys Gln Cys Gly Lys
545 550 555

Ala Phe Arg Leu Asn Ser His Leu Thr Glu His Gln Arg Ile His
560 565 570

Thr Gly Glu Lys Pro Tyr Glu Cys Lys Val Cys Arg Lys Ala Phe
575 580 585

Arg Gln Tyr Ser His Leu Tyr Gln His Gln Lys Thr His Asn Val
590 595 600

Ile

<210> 26
<211> 743
<212> PRT
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No: 7493326CD1
<400> 26

Met Met Gln Ala Gln Glu Ser Leu Thr Leu Glu Asp Val Ala Val
1 5 10 15

Asp Phe Thr Trp Glu Glu Trp Gln Phe Leu Ser Pro Ala Gln Lys
20 25 30

Asp Leu Tyr Arg Asp Val Met Leu Glu Asn Tyr Ser Asn Leu Val
35 40 45

Ala Val Gly Tyr Gln Ala Ser Lys Pro Asp Ala Leu Ser Lys Leu
50 55 60

Glu Arg Gly Glu Glu Thr Cys Thr Thr Glu Asp Glu Ile Tyr Ser
65 70 75

Arg His Cys ??? ??? Ser Gly Gly Ala Ser Gly Gly Ala Tyr Ala
80 85 90

Glu Ile Arg Lys Ile Asp Asp Pro Leu Gln His His Leu Gln Asn
95 100 105

Gln ??? Ile ??? Lys Ser Val Lys Gln Cys His Glu Gln Asn Met
110 115 120

??? Gly Asn Ile Val Asn Gln Asn Lys Gly His Phe Leu Leu Lys
125 130 135

??? Asp Cys Asp Thr Phe Asp Leu His ??? Lys Pro Leu Lys Ser
140 145 150

Asn Leu Ser Phe Glu Asn Gln Lys Arg Ser Ser Gly Leu Lys Asn
155 160 165

Ser Ala Glu Phe Asn Arg Asp Gly Lys ??? Leu Ile His Ala Asn
170 175 180

His Lys Gln Phe Tyr Thr Glu Met Lys Phe Pro Ala Ile Ala Lys
185 190 195

Pro Ile Asn Lys Ser Gln Phe Ile Lys Gln Gln Arg Thr His Asn
200 205 210

Ile Glu Asn Ala His Val Cys Ser Glu Cys Gly Lys Ala Phe Leu
215 220 225

Lys Leu Ser Gln Phe Ile Asp His Gln Arg Val His Thr Gly Glu
230 235 240

Lys Pro His Val Cys Ser Met Cys Gly Lys Ala Phe Ser Arg Lys
245 250 255

Ser Arg Leu Met Asp His Gln Arg Thr His Thr Glu Leu Lys His
260 265 270

Tyr Glu Cys Thr Glu Cys Asp Lys Thr Phe Leu Lys Lys Ser Gln
275 280 285

Leu Asn Ile His Gln Lys Thr His Met Gly Gly Lys Pro Tyr Thr
290 295 300

Cys Ser Gln Cys Gly Lys Ala Phe Ile Lys Lys Cys Arg Leu Ile
305 310 315

Tyr His Gln Arg Thr His Thr Gly Glu Lys Pro His Gly Cys Ser
320 325 330

Val Cys Gly Lys Ala Phe Ser Thr Lys Phe Ser Leu Thr Thr His
335 340 345

Gln Lys Thr His Thr Gly Glu Lys Pro Tyr Ile Cys Ser Glu Cys
350 355 360

Gly Lys Gly Phe Ile Glu Lys Arg Arg Leu Thr Ala His His Arg
365 370 375

Thr His Thr Gly Glu Lys Pro Phe Ile Cys Asn Lys Cys Gly Lys
380 385 390

Gly Phe Thr Leu Lys Asn Ser Leu Ile Thr His Gln Gln Thr His
395 400 405

Thr Gly Glu Lys Leu Tyr Thr Cys Ser Glu Cys Gly Lys Gly Phe
410 415 420

Ser Met Lys His Cys Leu Met Val His Gln Arg Thr His Thr Gly
425 430 435

Glu Lys Pro Tyr Lys Cys Asn Glu Cys Gly Lys Gly Phe Ala Leu
440 445 450

Lys Ser Pro Leu Ile Arg His Gln Arg Thr His Thr Gly Glu Lys
455 460 465

Pro Tyr Val Cys Thr Glu Cys Arg Lys Gly Phe Thr Met Lys Ser
470 475 480

Asp Leu Ile Val His Gln Arg Thr His Thr Ala Glu Lys Pro Tyr
485 490 495

Ile Cys Asn Asp Cys Gly Lys Gly Phe Thr Val Lys Ser Arg Leu
500 505 510

Ile Val His Gln Arg Thr His Thr Gly Glu Lys Pro Tyr Val Cys
515 520 525

Gly Glu Cys Gly Lys Gly Phe Pro Ala Lys Ile Arg Leu Met Gly
530 535 540

His Gln Arg Thr His Thr Gly Glu Lys Pro Tyr Ile Cys Asn Glu
545 550 555

Cys Gly Lys Gly Phe Thr Glu Lys Ser His Leu Asn Val His Arg
560 565 570

Arg Thr His Thr Gly Glu Lys Pro Tyr Val Cys Ser Glu Cys Gly
575 580 585

Lys Gly Leu Leu Gly Arg Ala Cys Ser Leu His His Gln Ala Asn
590 595 600

Ser Tyr Trp Gly Glu Lys Pro Tyr Ile Cys Asn Glu Cys Gly Lys
605 610 615

Gly Phe Ser Met Lys Ser Thr Leu Ser Ile His Gln Gln Thr His
620 625 630

Thr Gly Glu Lys Pro Tyr Lys Cys Asn Glu Cys Asp Lys Thr Phe
635 640 645

Arg Lys Lys Thr Cys Leu Ile Gln His Gln Arg Phe His Thr Gly
650 655 660

Lys Thr Ser Phe Ala Cys Thr Glu Cys Gly Lys Phe Ser Leu Arg
665 670 675

Lys Asn Asp Leu Ile Thr His Gln Arg Ile His Thr Gly Glu Lys
680 685 690

Pro Tyr Lys Cys Ser Asp Cys Gly Lys Ala Phe Thr Thr Lys Ser
695 700 705

Gly Leu Asn Val His Gln Arg Lys His Thr Gly Glu Arg Pro Tyr
710 715 720

Gly Cys Ser Asp Cys Gly Lys Ala Phe Ala His Leu Ser Ile Leu
725 730 735

Val Lys His Lys Arg Ile His Arg
740

















<210> 27 
<211> 490 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 1553836CD1 

<400> 27 

Met Lys Met Arg Arg Ile Lys Pro Ala Ala Thr Ser His Val Glu
1 5 10 15

Gly Ser Gly Gly Val Ser Ala Lys Gly Lys Arg Lys Pro Arg Gln
20 25 30

Glu Glu Asp Glu Asp Tyr Arg Glu Phe Pro Gln Lys Lys His Lys
35 40 45

Leu Tyr Gly Arg Lys Gln Arg Pro Lys Thr Gln Pro Asn Pro Lys
50 55 60

Ser Gln Ala Arg Arg Ile Arg Lys Glu Pro Pro Val Tyr Ala Ala
65 70 75

Gly ??? ??? Glu Glu Gln Trp Tyr Leu Glu Ile Val Asp Lys Gly
80 85 90

Ser Val Ser Cys Pro Thr Cys Gln Ala Val Gly Arg Lys Thr Ile
95 100 105

Glu Gly Leu Lys Lys His Met Glu Asn Cys Lys Gln Glu Met Phe
110 115 120

Thr Cys His His Cys Gly Lys Gln Leu Arg Ser Leu Ala Gly Met
125 130 135

Lys Tyr His Val Met Ala Asn His Asn Ser Leu Pro Ile Leu Lys
140 145 150

Ala Gly Asp Glu Ile Asp Glu Pro Ser Glu Arg Glu Arg Leu Arg
155 160 165

Thr Val Leu Lys Arg Leu Gly Lys Leu Arg Cys Met Arg Glu Ser
170 175 180

Cys Ser Ser Ser Phe Thr Ser Ile Met Gly Tyr Leu Tyr His Val
185 190 195

Arg Lys Cys Gly Lys Gly Ala Ala Glu Leu Glu Lys Met Thr Leu
200 205 210

Lys Cys His His Cys Gly Lys Pro Tyr Arg Ser Lys Ala Gly Leu
215 220 225

Ala Tyr His Leu Arg Ser Glu His Gly Pro Ile ??? Phe Phe Pro
230 235 240

Glu Ser Gly Gln Pro Glu Cys Leu Lys Glu Met Asn Leu Glu Ser
245 250 255

Lys Ser Gly Gly Arg Val Gln Arg Arg Ser Ala Lys Ile Ala Val
260 265 270

Tyr His Leu Gln Glu Leu Ala Ser Ala Glu Leu Ala Lys Glu Trp
275 280 285

Pro Lys Arg Lys Val Leu Gln Asp Leu Val Pro Asp Asp Arg Lys
290 295 300

Leu Lys Tyr Thr Arg Pro Gly Leu Pro Thr Phe Ser Gln Glu Val
305 310 315

Leu His Lys Trp Lys Thr Asp Ile Lys Lys Tyr His Arg Ile Gln
320 325 330

Cys Pro Asn Gln Gly Cys Glu Ala Val Tyr Ser Ser Val Ser Gly
335 340 345

Leu Lys Ala His Leu Gly Ser Cys Thr Leu Gly Asn Phe Val Ala
350 355 360

Gly Lys Tyr Lys Cys Leu Leu Cys Gln Lys Glu Phe Val Ser Glu
365 370 375

Ser Gly Val Lys Tyr His Ile Asn Ser Val His Ala Glu Asp Trp
380 385 390

Phe Val Val Asn Pro Thr Thr Thr Lys Ser Phe Glu Lys Leu Met
395 400 405

Lys Ile Lys Gln Arg Gln Gln Glu Glu Glu Lys Arg Arg Gln Gln
410 415 420

His Arg Ser Arg Arg Ser Leu Arg Arg Arg Gln Gln Pro Gly Ile
425 430 435

Glu Leu Pro Glu Thr Glu Leu Ser Leu Arg Val Gly Lys Asp Gln
440 445 450

Arg Arg Asn Asn Glu Glu Leu Val Val Ser Ala Ser Cys Lys Glu
455 460 465

Pro Glu Gln Glu Pro Val Pro Ala Gln Phe Gln Lys Val Lys Pro
470 475 480

Pro Lys Thr Asn His Lys Arg Gly Arg Lys
485 490














<210> 28 
<211> 665 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 1908201CD1 

<400> 28 



Met Pro 


Leu 


Arg 


Asp 


Lys 


Tyr 


Cys 




X 11X 


Asp 


His 


His 


His 


His 


1 






c 
D 




















15 


Gly Cys 


Cys 


Glu 


Pro 


val 


Tyr 


He 


Leu 


m 1 1 


Pro 


Gly Asp 


Pro 


Pro 








Z U 










9 R 










30 


Leu Leu 


G In 


Gin 


Pro 


Leu 


Gin 


Thr 


Ser 


Lys 


Ser Gly 


Tl e> 

lie 


Gin 


Gin 




























45 


lie lie 


Glu 


Cy s 


Phe 


Arg 


Ser Gly 


Thr 


Lys 


n'} r-i 


Leu 


Lys 


His 


He 


















R R 










60 


Leu Leu 


Lys 


Asp 


Val 


ASp 


Thr 


He 


rue 




Cys 


Lys 


Leu 


{Vq 


At* ci 








bo 










70 










75 


Ser Leu 


Phe 


Arg 


Gly 


Leu 


Pro 


Asn 


Leu 




i nr 


nis 


Lys 


Lys 


Phe 


















R R 










90 


Tyr Cys 


Pro 


Pro 


Ser 


Leu 


Gin 


Met 


Asp 


Asp 


Asn 


Leu 


Pro 


Asp 


Val 


























105 


Asn Asp 


Lys 


Gin 


Ser 


ij-Ln 


Ala 


He 


Asn 


Asp 


Leu 


T on 


LjIU 


Ala 


He 








tin 










1 1 R 










120 


Tyr Pro 


Ser 


Val 


Asp 


Lys 


Arg 


Glu 


Tyr 




TT a 

lie 


Lys 


Leu 


Glu 


Pro 








125 










1j U 










135 


lie Glu 


Thr 


Asn 


Gin 


Asn 


Ala 


Val 


rile 


uxn 


Tyr 


Tl pa 

x le 


Ser 


Arg 


Thr 








140 










1 Ac 










150 


Asp Asn 


Pro 


He 


Glu 


Val 


Thr 


Glu 


bci 


Ser 




Thr 


Pro 


Glu 


Gin 






155 










i fin 










165 


Thr Glu 


Val 


Gin 


He 


Gin 


Glu 


Thr 


Ser 


X XIX 


bill 


Gin 


Ser 


Lys 


Thr 








170 










1 7 R 

JL / O 










180 


Val Pro 


Val 


Thr 


Asp 


Thr 


Glu 


Val 




X 11X7 


vai 


Glu 


Pro 


Pro 


Pro 








185 










X u 










195 


Val Glu 


lie 


Val 


Thr 


Asp 


Glu 


Val 


Ala 


Pro 


Thr 


Ser 


Asp 


Glu 


Gin 








200 










205 










210 


Pro Gin 


. Glu 


Ser 


Gin 


Ala 


Asp 


Leu 


Glu 


Thr 


Ser 


Asp 


Asn 


Ser 


Asp 








215 










220 










225 


Phe Gly His 


Gin 


Leu 


He 


Cys 


Cys 


Leu 


Cys 


Arg 


Lys 


Glu 


Phe 


Asn 








230 










235 










240 


Ser Arg 


Arg 


Gly Val 


Arg 


Arg 


His 


He 


Arg 


Lys 


Val 


His 


Lys 


Lys 








245 










250 










255 


Lys Met 


Glu 


Glu 


Leu 


Lys 


Lys 


Tyr 


He 


Glu 


Thr 


Arg 


Lys 


Asn 


Pro 








260 










265 










270 


Asn Gin 


Ser 


Ser 


Lys 


Gly Arg 


Ser 


Lys 


Asn 


Val 


Leu 


Val 


Pro 


Leu 








275 










280 










285 


Ser Arg 


Ser 


Cys 


Pro 


Val 


Cys 


Cys 


Lys 


Ser 


Phe 


Ala 


Thr 


Lys 


Ala 








290 










295 










300 


Asn Val 


Arg 


Arg 


His 


Phe 


Asp 


Glu 


Val 


His 


Arg 


Gly Leu 


Arg 


Arg 








305 










310 










315 


Asp Ser 


lie 


Thr 


Pro 


Asp 


He 


Ala 


Thr Lys 


Pro 


Gly Gin 


Pro 


Leu 








320 










325 










330 


Phe Leu 


Asp 


Ser 


He 


Ser 


Pro 


Lys 


Lys 


Ser 


Phe 


Lys 


Thr 


Arg 


Lys 








335 










340 










345 






Gin 


Lys 


Ser 


Ser 


Ser 


Lys 


Ala 


Glu 


Tyr Asn 


Leu 


Thr 


Ala 


Cys 


Lys 










350 










355 










360 


Cys 


Leu 


Leu 


Cys 


Lys 
365 


Arg 


Lys 


Tyr 


Ser 


Ser 
370 


Gin 


He 


Met 


Leu 


Lys 
375 


Arg 


His


Met 


Gin 


He 


Val 


His 


Lys 


He 


Thr 


Leu 


Ser Gly Thr Asn 










380 










385 










390 


Ser 


Lys 


Arg 


Glu 


Lys Gly 


Pro 


Asn 


Asn 


Thr 


Ala 


Asn 


Ser 


Ser 


Glu 










395 










400 










405 


lie 


Lys 


Val 


Lys 


Val 
A1 n 

f± 1 U 


vj-LU 


Pro 


Ala 


Asp 


Ser 
415 


Val 


Glu 


Ser 


Ser 


Pro 
420 


Pro 


Ser 


He 


Thr 


TT 1 _ 
XllS 


Ser 


Pro 


Gin Asn Glu Leu Lys Gly Thr Asn 










425 










430 










435 


His 


Ser 


Asn 


Glu 


Lys 
a An 


Lys 


Asn 


i nr 


Pro 


Ala 
445 


Ala 


Gin 


Lys 


Asn 


Lys 
450 


Val 


Lys 


Gin 


Asp 


ber 
455 


blU 


Ser 


Pro 


Lys 


Ser 
460 


Thr 




Pro 


Ser 


Ala 
465 


Ala Gly Gly Gin 


\jXTL 


Lys 


1 XXX. 


Arg 


Lys 


Pro 


Lys 




Ser 


Ala 












470 










475 










480 


Phe 


Asp 


Phe 


Lys 


oin 

APR 
ft O 3 


Leu 


Tyr 


Cys 


Lys 


Leu 
490 


Cys 


T Arc 
i-tjf to 


AX7g 


Gin 


Phe 
495 


Thr 


Ser 


Lys 


Gin 


Asn 


Leu 


± hi. 


Lys 


His 


He 
505 


Glu 


Leu 


His 


Thr 


Asp 
510 


Gly Asn Asn 


He 


Tyr 


val 


Lys 


Fne 


Tyr 


Lys 


Cys 


Jrro 


Leu 


to 












515 










520 










525 


Tyr 


Glu 


Thr 


Arg 


Arg 
530 


T ve> 

Lys 


Arg 


Asp 


Val 


He 
535 


Arg 


T-H c= 
n j. o 


He 


Thr 


Val 
540 


Val 


His 


Lys 


Lys 


Ser 


Ser 


Aiy 


Tyr 


Leu Gly Lys 


Tl — 

lie 


inr 


Ala 


Ser 










545 










550 










555 


Leu 


Glu 


He 


Arg 


Ala 
560 


He 


T 

Lys 


Lys 


Pro 


He 
565 


Asp 


Phe 


Val 


Leu 


Asn 
570 


Lys 


Val 


Ala 


Lys 


Arg 
575 


Gly 


Pro 


Ser 


Arg 


Asp 
580 


Glu 


Ala 


Lys 


His 


Ser 
585 


Asp 


Ser Lys 


His 


Asp 


Gly 


Thr 


Ser 


Asn 


Ser 


Pro 


Ser 


Lys 


Lys 


Tyr 










590 










595 










600 


Glu 


Val 


Ala 


Asp 


Val 


Gly 


lie? 


Glu 


Val 


Lys Val 


Thr 


Lys 


Asn 


Phe 










605 










610 










615 


Ser 


Leu 


His 


Arg 


Cys 
620 


Asn 


Lys 


Cys 


Gly 


Lys 
625 


Ala 


Phe 


Ala 


Lys 


Lys 
630 


Thr 


Tyr 


Leu 


Glu 


His 
635 


His 


Lys 


Lys 


Thr 


His 
640 


Lys 


Ala 


Asn 


Ala 


Ser 
645 


Asn 


Ser 


Pro 


Glu 


Gly Asn 


Lys 


Thr 


Lys 


Gly 


Arg 


Ser 


Thr 


Arg 


Ser 










650 










655 










660 


Lys Ala Leu Val Trp
665























<210> 29 

<211> 570 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 2827615CD1 

<400> 29 

Met Ser Lys Asp Leu Val Thr Phe Gly Asp Val Ala Val Asn Phe 




1 5 10 15


Ser Gin 


Glu 




Glu Trp Leu


Asn 


Pro 


Ala 


Gin Arg Asn 


Leu 












25 






30 


Tyr Arg 


Lys 


Val Met


Leu Glu Asn 


Tvr 


Arg 


Ser 


Leu Val Ser 


Leu 






35 






40 






45 


Ala Gly Val 


Ser Val 


Ser Lys Pro 


Asp 


Val 


lie 


Ser Leu Leu 


Glu 






50 






55 






60 


win yj±y 


Lys 


Glu Pro 


Trp Met Val 


Lvs 


Lys 


Glu 


Gly Thr Arg 


Gly 






65 






70 






75 


±rDTO LyS 


Pro 


Asp Trp 


Glu Tyr Val 


Phe 


Lys 


Asn 


Ser Glu Phe 


Ser 






80 












90 


Ser Lys 


Gin 


Glu Thr Tyr Glu Glu 


Ser 


Ser 


Lys 


Val Val Thr 


Val 












100 






105 


Gly Ala 


Arg 


His Leu 


ber iyr oci 


Leu 


Asp 


Tvr 
■ l j a 


Pro Ser Leu 


Arg 






11U 






-1 — L — J 






120 


Glu Asp 


Cys 


Gin Ser 


Glu Asp Trp 


A JT 1 


Lys 


Asn 


Gin Leu Glv 


Ser 






IOC 






130 






135 


Gin Glu 


val 


His Leu 


aer bin Leu 


lie 


He 


Thr 


His Lvs Glu 


He 






1 A A 
14U 












150 


Leu Pro


pi,, 

blU 


Val Gin 


Asn Lys Glu 


TVt 
1 j 1 


Asn 


Lys 


Ser Trp Gin 


Thr 






"ICC 






160 






165 


Pne His 


bin 


Asp Thr 


lie Phe Asp 


lie 


Gin 


Gin 


Ser Phe Pro 


Thr 






1 / U 






175 






180 


Lys Glu 


Lys 


Ala His 


Lys His Glu 


Pro 


Gin 


Lys 


Lys Ser Tyr 


Arg 






1 Q C 
lOD 






190 






195 


Lys Lys 


Ser 


Val Glu 


Met Lys His 


Arg 


Lys 


Val 


Tyr Val Glu 


Lvs 






zUU 












210 


Lys Leu 


Jjeu 


Lys Cys 


Asn Asp Cys 


Glu 


Lys 


Val 


Phe Asn Gin 


Ser 






215 






220 






225 


Ser Ser


lieu 


Thr Leu 


His Gin Arg 


lie 


His 


Thr Gly Glu Lys 


Pro 






230 






4* .j .j 






240 


Tyr Ala 


Cys 


Val Glu 


Cys Gly Lys 


Thr 


Phe 


Ser 


Gin Ser Ala 


Asn 






245 






250 






255 


Leu Ala 


Gin 


His Lys 


Arg lie His 


Thr 


Gly 


Glu 


Lys Pro Tyr 


Glu 






260 






265 






270 


Cys Lys 


Glu 


Cys Arg 


Lys Ala Phe 


Ser 


Gin 


Asn 


Ala His Leu 


Ala 






275 






280 






285 


Gin His 


Gin 


Arg Val 


His Thr Gly Glu 


Lys 


Pro 


Tyr Gin Cys 


Lys 






290 






ZJ -J 






300 


Glu Cys 


Lys 


Lys Ala 


Phe Ser Gin 


lie 


Ala 


His 


Leu Thr Gin 


His 






305 






310 






315 


Gin Arg 


Val 


His Thr 


Gly Glu Arg 


Pro 


Phe 


Glu 


Cys He Glu 


Cys 






320 






325 






330 


Gly Lys 


Ala 


Phe Ser 


Asn Gly Ser 


Phe 


Leu 


Ala 


Gin His Gin 


Arg 






335 






340 






345 


lie His 


Thr Gly Glu 


Lys Pro Tyr Val 


Cys 


Asn 


Val Cys Gly 


Lys 






350 






355 






360 


Ala Phe 


Ser 


His Arg 


Gly Tyr Leu 


He 


Val 


His 


Gin Arg He 


His 






365 






370 






375 


Thr Gly Glu 


Arg Pro 


Tyr Glu Cys 


Lys 


Glu 


Cys 


Arg Lys Ala 


Phe 






380 






385 






390 


Ser Gin 


Tyr 


Ala His 


Leu Ala Gin 


His 


Gin 


Arg 


Val His Thr 


Gly 






395 






400 






405 


Glu Lys 


Pro 


Tyr Glu 


Cys Lys Val 


Cys 


Arg 


Lys 


Ala Phe Ser 


Gin 






410 






415 






420 


He Ala 


Tyr 


Leu Asp 


Gin His Gin 


Arg 


Val 


His 


Thr Gly Glu 


Lys 






425 



430 



435 



rXO 


Tyr 




Cys 


lie 


ulu 


^— to 


Gly 


Lys 


Ala 


Phe 


Ser 


Asn 


Ser 


Ser 




















445 










450 


Ser 


Leu 


A J- a. 


will 


nib 


nl ti 
oj.li 


nig 


Ser 


His 


Thr Gly 


Glu 


Lys 


Pro 


Tvr 










u j 










460 










465 




Cys 


Lys 


VJJL u. 


Cys 


Arg 


Lys 


Thx 


Phe 


Ser 


Gin 


Asn 


Ala 


Glv 


Leu 










470 










475 










480 


Ala 




111 s 




Arg 


He 


His 


Thr Gly 


Glu 


Lys 


Pro 




Glu 


Cys 










/IOC 










490 










495 


Asn 


Val 


Cys 




Lys 


Ala 


Phe 


Ser 


Tyr 


Ser Gly 


Ser 


Leu 


Thr 


Leu 




















505 










510 


His 


Gin 


Arg 


He 


His 


Thr Gly 


Glu 


Arg 


Pro 


Tyr 


Glu 


Cys 


Lys 


Asp 










515 










520 










525 


Cys 


Arg 


Lys 


Ser 


Phe 


Arg 


Gin 


Arg 


Ala 


His 


Leu 


Ala 


His 


His 


Glu 










530 










535 










540 


Arg 


He 


His 


Thr 


Met 


Glu 


Ser 


Phe 


Leu 


Thr 


Leu 


Ser 


Ser 


Pro 


Ser 










545 










550 










555 


Pro Ser Thr Ser Asn Gln Leu Pro Arg Pro Val Gly Phe Ile Ser
560 565 570

<210> 30

<211> 1712

<212> PRT

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 4304550CD1

Met Glu Arg Asn Val Leu Thr Thr Phe Ser Gln Glu Met Ser Gln
1 5 10 15

Leu Ile Leu Asn Glu Met Pro Lys Ala Glu Tyr Ser Ser Leu Phe
20 25 30

Asn Asp Phe Val Glu Ser Glu Phe Phe Leu Ile Asp Gly Asp Ser
35 40 45

Leu Leu Ile Thr Cys Ile Cys Glu Ile Ser Phe Lys Pro Gly Gln
50 55 60

Asn Leu His Phe Phe Tyr Leu Val Glu Arg Tyr Leu Val Asp Leu
65 70 75

Ile Ser Lys Gly Gly Gln Phe Thr Ile Val Phe Phe Lys Asp Ala
80 85 90

Glu Tyr Ala Tyr Phe Asn Phe Pro Glu Leu Leu Ser Leu Arg Thr
95 100 105

Ala Leu Ile Leu His Leu Gln Lys Asn Thr Thr Ile Asp Val Arg
110 115 120

Thr Thr Phe Ser Arg Cys Leu Ser Lys Glu Trp Gly Ser Phe Leu
125 130 135

Glu Glu Ser Tyr Pro Tyr Phe Leu Ile Val Ala Asp Glu Gly Leu
140 145 150

Asn Asp Leu Gln Thr Gln Leu Phe Asn Phe Leu Ile Ile His Ser
155 160 165

Trp Ala Arg Lys Val Asn Val Val Leu Ser Ser Gly Gln Glu Ser
170 175 180

Asp Val Leu Cys Leu Tyr Ala Tyr Leu Leu Pro Ser Met Tyr Arg
185 190 195


His 


Gin 


He 


Phe 


Ser 
200 


Trp 


Lys 


Asn 


Lys 


Gin 
205 


Asn 


He 


Lys 


Asp Ala 
210 


Tyr 


Thr 


Thr 


Leu 


Leu 
215 


Asn 


Gin 


Leu 


Glu 


Arg 
220 


Phe 


Lys 


Leu 


Ser Ala 
225 


Leu 


Ala 


Pro 


Leu 


Phe 
230 


Glv 


Ser 


Leu 


Lys 


Trp 

235 


Asn 


Asn 


He 


Thr Glu 
240 


Glu 


Ala 


His 


Lys 


Thr 
245 


Val 


Ser 


Leu 


Leu 


Thr 
250 


Gin 


Val 


Trp 


Pro Glu 
255 


Gly 


Ser 


Asp 


He 


Arg 

<&• D KJ 


Arg 


Val 


Phe 


Cys 


Val 
265 


Thr 


Ser 


Cys 


Ser Leu 
270 


Ser 


Leu 


Arg 


Met 


Tyr 


His 


Arg Phe Leu Gly Asn Arg Glu 


Pro Ser 










97 5 










9R0 

£k O \J 








285 


Ser 


Gly 


Gin 


Glu 


Thr 
9 q n 


Glu 


lie 


rjl r-t 

1*7 J. 11 


al n 


v ax 
9 Q R 

jL> -7 J 


Asn 


Ser 


Asn 


Cys Leu 
300 


Thr 


Leu 


Gin 


Glu 


Met 
70^ 

S> \J ~J 


Glu 


Asp 


Leu 


Cys 


Lys 
71 n 


Leu 


His 


Cys 


Leu Thr 
315 


Val 


Val 


Phe 


Leu 


Leu 

7 90 

J Z. \J 


His 


Leu 


Pro 


Leu 


Ser 

79 5 
o _> 


Gin 


Arg 


Ala 


Cys Ala 
330 


Arg 


Val 


He 


Thr 


Ser 

77 5 
j j -j 


His 


Trp 


nla 


czl ii 


Asp 
7 AO 

O *± \J 


Met 


Lys 


Pro 


Leu Leu 
345 


Gin 


Met 


Lys 


Lys 


Trp 
750 


Cys 


Glu 


Tyr 


Phe 


He 

O -J -J 


Leu 


Arg 


Asn 


He His 
360 


Thr 


Phe 


Glu 


Phe 


Trp 
7fi5 

O u ~J 


Asn 


Leu 


Asn 


Leu 


He 

770 


His 


Leu 


Ser 


Asp Leu 
375 


Asn 


Asp 


ul LI 


Leu 


Leu 

J O VJ 


Leu 


Lys 


Asn 


He 


Ala 

7 R5 
s> O J 


Phe 


Tyr 


Tyr 


Glu Asn 
390 


Glu 


Asn 


Val 


Lys 


Gly 

~> J ~J 


Leu 


His 


Leu 


Asn 


Leu 
AOO 


Gly 


Asp 


Thr 


He Met 
405 


Lys 


Asp 


Tyr 


Glu 


Tyr 

Al 0 


Leu 


Trp 


Asn 


X IlX 


Tip 

Al 5 


Ser 


Lys 


Leu 


Val Arg 
420 


Asp 




nl n 


Val 


al -v/ 


Gin 


Pro 


Phe 


Pro 


Leu 


Arg 


Thr Thr Lys Val 










A9 5 










A70 

1 J u 








435 


Cys 


Phe 


Leu 


Glu 


Lys 

AAO 


Lys 


Pro 


Ser 


Pro 


He 
AA5 


Lys 


Asp 


Ser 


Ser Asn 
450 


Glu 


Met 


Val 


Pro 


Asn 
*± ~j ~j 


Leu 


Gly 


Phe 


He 


Pro 

AGO 


Thr 


Ser 


Ser 


Phe Val 
465 


Val 


Asp 


Lys 


Phe 


Ala 


Gly Asp 


He 


Leu 


Lvs 


Asp 


Leu 


Pro 


Phe Leu 










A 7 0 










475 








480 


Lys 


Ser 


Asp 


Asp 


Pro 
485 


He 


Val 


Thr 


Ser 


Leu 
490 


Val 


Lys 


Gin 


Lys Glu 
495 


xr lie 


Asp 


Glu 


Leu 


Val 
500 


His 


Trp 


His 


Ser 


His 
505 


Lys 


Pro 


Leu 


Ser Asp 
510 


Asp 


Tyr 


Asp 


Arg 


Ser 

51 5 


Arg 


Cys 


Gin 


Phe 


Asp 
520 


Glu 


Lys 


Ser 


Arg Asp 
525 


Pro 


Arg 


Val 


Leu 


Arg 

570 


Ser 


Val 


Gin 


Lys 


Tyr 
535 


His 


Val 


Phe 


Gin Arg 
540 


Phe 


Tyr Gly Asn 


Ser 


Leu 


Glu 


Thr 


Val 


Ser 


Ser 


Lys 


He 


He Val 










CAR 










550 








555 


Thr 


Gin 


Thr 


He 


Lvs 
560 


Ser 


Lys 


Lys 


Asp 


Phe 
565 


Ser 


Gly 


Pro 


Lys Ser 
570 


Lys 


Lys 


Ala 


His 


Glu 
575 


Thr 


Lys 


Ala 


Glu 


He 
580 


He 


Ala 


Arg 


Glu Asn 
585 


Lys 


Lys 


Arg 


Leu 


Phe 
590 


Ala 


Arg 


Glu 


Glu 


Gin 
595 


Lys 


Glu 


Glu 


Gin Lys 
600 


Trp 


Asn 


Ala 


Leu 


Ser 


Phe 


Ser 


He 


Glu 


Glu 


Gin 


Leu 


Lys 


Glu Asn 
605 610 615

Leu His Ser Gly Ile Lys Ser Leu Glu Asp Phe Leu Lys Ser Cys
620 625 630

Lys Ser Ser Cys Val Lys Leu Gln Val Glu Met Val Gly Leu Thr
635 640 645

Ala Cys Leu Lys Ala Trp Lys Glu His Cys Arg Ser Glu Glu Gly
650 655 660

Lys Thr Thr Lys Asp Leu Ser Ile Ala Val Gln Val Met Lys Arg
665 670 675

Ile His Ser Leu Met Glu Lys Tyr Ser Glu Leu Leu Gln Glu Asp
680 685 690

Asp Arg Gln Leu Ile Ala Arg Cys Leu Lys Tyr Leu Gly Phe Asp
695 700 705

Glu Leu Ala Ser Ser Leu His Pro Ala Gln Asp Ala Glu Asn Asp
710 715 720

Val Lys Val Lys Lys Arg Asn Lys Tyr Ser Val Gly Ile Gly Pro
725 730 735

Ala Arg Phe Gln Leu Gln Tyr Met Gly His Tyr Leu Ile Arg Asp
740 745 750

Glu Arg Lys Asp Pro Asp Pro Arg Val Gln Asp Phe Ile Pro Asp
755 760 765

Thr Trp Gln Arg Glu Leu Leu Asp Val Val Asp Lys Asn Glu Ser
770 775 780

Ala Val Ile Val Ala Pro Thr Ser Ser Gly Lys Thr Tyr Ala Ser
785 790 795

Tyr Tyr Cys Met Glu Lys Val Leu Lys Glu Ser Asp Asp Gly Val
800 805 810

Val Val Tyr Val Ala Pro Thr Lys Ala Leu Val Asn Gln Val Ala
815 820 825

Ala Thr Val Gln Asn Arg Phe Thr Lys Asn Leu Pro Ser Gly Glu
830 835 840

Val Leu Cys Gly Val Phe Thr Arg Glu Tyr Arg His Asp Ala Leu
845 850 855

Asn Cys Gln Val Leu Ile Thr Val Pro Ala Cys Phe Glu Ile Leu
860 865 870

Leu Leu Ala Pro His Arg Gln Asn Trp Val Lys Lys Ile Arg Tyr
875 880 885

Val Ile Phe Asp Glu Val His Cys Leu Gly Gly Glu Ile Gly Ala
890 895 900

Glu Ile Trp Glu His Leu Leu Val Met Ile Arg Cys Pro Phe Leu
905 910 915

Ala Leu Ser Ala Thr Ile Ser Asn Pro Glu His Leu Thr Glu Trp
920 925 930

Leu Gln Ser Val Lys Trp Tyr Trp Lys Gln Glu Asp Lys Ile Ile
935 940 945

Glu Asn Asn Thr Ala Ser Lys Arg His Val Gly Arg Gln Ala Gly
950 955 960

Phe Pro Lys Asp Tyr Leu Gln Val Lys Gln Ser Tyr Lys Val Arg
965 970 975

Leu Val Leu Tyr Gly Glu Arg Tyr Asn Asp Leu Glu Lys His Val
980 985 990

Cys Ser Ile Lys His Gly Asp Ile His Phe Asp His Phe His Pro
995 1000 1005

Cys Ala Ala Leu Thr Thr Asp His Ile Glu Arg Tyr Gly Phe Pro
1010 1015 1020

Pro Asp Leu Thr Leu Ser Pro Arg Glu Ser Ile Gln Leu Tyr Asp
1025 1030 1035

Ala Met Phe Gln Ile Trp Lys Ser Trp Pro Arg Ala Gln Glu Leu
1040 1045 1050

Cys Pro Glu Asn Phe Ile His Phe Asn Asn Lys Leu Val Ile Lys
1055 1060 1065

Lys Met Asp Ala Arg Lys Tyr Glu Glu Ser Leu Lys Ala Glu Leu
1070 1075 1080

Thr Ser Trp Ile Lys Asn Gly Asn Val Glu Gln Ala Arg Met Val
1085 1090 1095

Leu Gln Asn Leu Ser Pro Glu Ala Asp Leu Ser Pro Glu Asn Met
1100 1105 1110

Ile Thr Met Phe Pro Leu Leu Val Glu Lys Leu Arg Lys Met Glu
1115 1120 1125

Lys Leu Pro Ala Leu Phe Phe Leu Phe Lys Leu Gly Ala Val Glu
1130 1135 1140

Asn Ala Ala Glu Ser Val Ser Thr Phe Leu Lys Lys Lys Gln Glu
1145 1150 1155

Thr Lys Arg Pro Pro Lys Ala Asp Lys Glu Ala His Val Met Ala
1160 1165 1170

Asn Lys Leu Arg Lys Val Lys Lys Ser Ile Glu Lys Gln Lys Ile
1175 1180 1185

Ile Asp Glu Lys Ser Gln Lys Lys Thr Arg Asn Val Asp Gln Ser
1190 1195 1200

Leu Ile His Glu Ala Glu His Asp Asn Leu Val Lys Cys Leu Glu
1205 1210 1215

Lys Asn Leu Glu Ile Pro Gln Asp Cys Thr Tyr Ala Asp Gln Lys
1220 1225 1230

Ala Val Asp Thr Glu Thr Leu Gln Arg Val Phe Gly Arg Val Lys
1235 1240 1245

Phe Glu Arg Lys Gly Glu Glu Leu Lys Ala Leu Ala Glu Arg Gly
1250 1255 1260

Ile Gly Tyr His His Ser Ala Met Ser Phe Lys Glu Lys Gln Leu
1265 1270 1275

Val Glu Ile Leu Phe Arg Lys Gly Tyr Leu Arg Val Val Thr Ala
1280 1285 1290

Thr Gly Thr Leu Ala Leu Gly Val Asn Met Pro Cys Lys Ser Val
1295 1300 1305

Val Phe Ala Gln Asn Ser Val Tyr Leu Asp Ala Leu Asn Tyr Arg
1310 1315 1320

Gln Met Ser Gly Arg Ala Gly Arg Arg Gly Gln Asp Leu Met Gly
1325 1330 1335

Asp Val Tyr Phe Phe Asp Ile Pro Phe Pro Lys Ile Gly Lys Leu
1340 1345 1350

Ile Lys Ser Asn Val Pro Glu Leu Arg Gly His Phe Pro Leu Ser
1355 1360 1365

Ile Thr Leu Val Leu Arg Leu Met Leu Leu Ala Ser Lys Gly Asp
1370 1375 1380

Asp Pro Glu Asp Ala Lys Ala Lys Val Leu Ser Val Leu Lys His
1385 1390 1395

Ser Leu Leu Ser Phe Lys Gln Pro Arg Val Met Asp Met Leu Lys
1400 1405 1410

Leu Tyr Phe Leu Phe Ser Leu Gln Phe Leu Val Lys Glu Gly Tyr
1415 1420 1425

Leu Asp Gln Glu Gly Asn Pro Met Gly Phe Ala Gly Leu Val Ser
1430 1435 1440

His Leu His Tyr His Glu Pro Ser Asn Leu Val Phe Val Ser Phe
1445 1450 1455


Leu 


Val 


Asn 




Phe 


His 


Asp 


Leu Cys 


Gin 


Pro 


Thr 


Arg Lys 








1460 








1465 








1470 


Gly Ser Lys 




Ser 


Gin 


Asp 


Val Met 


Glu 


Lvs 


Leu 


Val Leu 








1 47 5 
x*± / j 








1480 








1485 


val 


Leu 


Axa 




Phe Gly Arg 


Arcj Tyr 


Phe 


Pro 


Pro 


Lys Phe 








1490 








1495 








1500 




Asp 


AX a 




nl ii 


"PViea 
jrllfci 


xyr 


Gin Ser 


Lvs 


Val 


Phe 


Leu Asp 






1505 








1510 








1515 


Asp 


Jj eu 


Pro 




Phe 


Ser 


Asp 


Ala Leu 


Asp 


Glu 


Tyr 


Asn Met 








1 ROD 








1525 








1530 


Lys 


lie 


Met 




Phe 


Thr 


Thr 


Phe Leu 


Arg 


He 


Val 


Ser Lys 
















1540 








1545 


Leu 


Ala 


Asp 


1*1 C! L. A&Xl 


Gin 


Glu 


Tyr 


Gin Leu 


Pro 


Leu 


Ser 


Lys lie 
















1555 








1560 


Lys 


Phe 


Thr 


Gly Lys 


Glu 


Cys 


Glu 


Asp Ser 


Gin 


Leu 


Val 


Ser His 








1565 








1570 








1575 


Leu 


Met 


Ser 


Cys Lys 


Glu 


Gly 


Arg 


Val Ala 


lie 


Ser 


Pro 


Phe Val 








1580 








1585 








1590 


Cys 


Leu 


Ser 


Gly Asn 


Phe 


Asp 


Asp 


Asp Leu 


Leu 


Arcr 


Leu 


Glu Thr 








1595 








1600 








1605 


Pro Asn His Val Thr Leu Gly Thr lie Gly Val 


Asn 


Arg 


Ser Gin 








1610 








1615 








1620 


Ala 


Pro 


Val 


Leu Leu 


Ser 


Gin 


Lys 


Phe Asp 


Asn 


Arg 


Gly Arg Lys 








1625 








1630 








1635 


Met 


Ser 


Leu 


Asn Ala 


Tyr 


Ala 


Leu 


Asp Phe 


Tyr 


Lys 


His 


Gly Ser 








1640 








1645 








1650 


Leu 


lie 


Gly 


Leu Val 


Gin 


Asp 


Asn 


Arg Met 


Asn Glu Gly Asp Ala 








1655 








1660 








1665 


Tyr 


Tyr 


Leu 


Leu Lys 


Asp 


Phe 


Ala 


Leu Thr 


He 


Lys 


Ser 


He Ser 








1670 








1675 








1680 


Val 


Ser 


Leu 


Arg Glu 


Leu 


Cys 


Glu 


Asn Glu 


Asp 


Asp 


Asn 


Val Val 








1685 








1690 








1695 


Leu 


Ala 


Phe 


Glu Gin 


Leu 


Ser 


Thr 


Thr Phe 


Trp 


Glu 


Lys 


Leu Asn 








1700 








1705 








1710 



Lys Val 
Met Ser Arg Phe Pro Ala Val Ala Gly Arg Ala Pro Arg Arg Gln
1 5 10 15

Glu Glu Gly Glu Arg Pro Ile Glu Leu Gln Glu Glu Arg Pro Ser
20 25 30

Ala Val Arg Ile Ala Asp Arg Glu Glu Lys Gly Cys Thr Ser Gln
35 40 45

Glu Gly Gly Thr Thr Pro Thr Phe Pro Ile Gln Lys Gln Arg Lys
50 55 60
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Lys 


Leu 


He 


Gin Ala 


Val 


Arg 


Asp 


Asn 


Ser 




Leu 


He Val Thr 








65 










70 






75 


Gly Asn 


Thr Gly Ser Gly Lys Thr Thr Gin 


Leu 


Pro 


Lys Tyr Leu 








on 
o u 










85 






90 


Tyr 


Glu 


Ala 


Gly Phe 


Ser 


Gin 


His 


Gly Met 


lie 


Gly Val Thr Gln


















100 






105 


Pro 


Arg 


Lys 


Val Ala 


Ala 


He 


Ser 


Val 


Ala 


Gin 


Arg 


val Ala ulU 








11U 










115 






120 


Glu 


Met 


Lys 


Cys Thr 


Leu 


Gly Ser Lys 


Val 


Gly 


Tyr 


Gin Val Arg 








125 










130 






135 


Phe 


Asp 


Asp 


Cys Ser 


Ser 


Lys 


Glu 


Thr 


Ala 


He 


Lys 


Tyr Met Thr 








140 










145 






150 


Asp Gly 


Cys 


Leu Leu 


Lys 


His 


He 


Leu 


Gly Asp 


Pro 


Asn Leu Thr 








ICC 

lbb 










160 






165 


Lys 


Phe 


Ser 


Val He 


He 


Leu 


Asp 


Glu 


Ala 


His 


Glu 


Arg Thr Leu 








170 










175 






180 


Thr 


Thr 


Asp 


lie Leu 


Phe 


Gly 


Leu 


Leu 


Lys 


Lys 


Leu 


Phe Gin Glu 








1 o c 

18b 










190 






195 


Lys 


Ser 


Pro 


Asn Arg 


Lys 


Glu 


His 


Leu 


Thr 


Ser 


Gly Gly Thr Cys 








200 










205 








His 


Ala 


Thr 


Met Glu 


Leu 


Ala 


Lys 


Leu 


Ser 


Ala 


Pne 


Phe Gly Asn 








one 

215 










220 






o o 


Cys 


Pro 


lie 


Phe Asp 


lie 


Pro 


Gly 


Arg 


Leu 


Tyr 


Pro 


val Arg Glu 








230 










235 






o a n 


Lys 


Phe 


Cys 


Asn Leu 


He 


Gly 


Pro 


Arg 


Asp 


Arg 


Glu 


Asn Thr Ala 








245 










250 






Ado 


Tyr 


He 


Gin 


Ala He 


Val 


Lys 


Val 


Thr 


Met 


Asp 


Tl - 

lie 


His Leu Asn 








260 










265 






Z / U 


Glu 


Met 


Ala 


Gly Asp 


He 


Leu 


Val 


Phe 


Leu 


Thr 


Gly 


Gin Pne Glu 








275 










280 






285 


He 


Glu 


Lys 


Ser Cys 


Glu 


Leu 


Leu 


Phe 


Gin 


Met 


Ala 


Glu Ser Val 








290 










295 






300 


Asp Tyr Asp Tyr Asp Val 


Gin 


Asp 


Thr 


Thr 


Leu 


Asp 


Gly Leu Leu 








305 










310 






315 


He 


Leu 


Pro 


Cys Tyr 


Gly 


Ser 


Met 


Thr 


Thr 


Asp 


Gin 


Gin Arg Arg 








320 










325 






330 


He 


Phe 


Leu 


Pro Pro 


Pro 


Pro 


Gly 


lie 


Arg 


Lys 


Cys 


Val He Ser 








335 










340 






345 


Thr 


Asn 


He 


Ser Ala 


Thr 


Ser 


Leu 


Thr 


He 


Asp 


Gly 


He Arg Tyr 








350 










355 






360 


Val 


Val 


Asp 


Gly Gly Phe Val 


Lys 


Gin 


Leu 


Asn 


His 


Asn Pro Arg 








365 










370 






375 


Leu 


Gly 


Leu 


Asp He 


Leu 


Glu 


Val 


Val 


Pro 


He 


Ser 


Lys Ser Glu 








380 










385 






390 


Ala 


Leu 


Gin 


Arg Ser Gly Arg 


Ala 


Gly 


Arg 


Thr 


Ser 


Ser Gly Lys 








395 










400 






405 


Cys 


Phe 


Arg 


He Tyr 


Ser 


Lys 


Asp 


Phe 


Trp 


Asn 


Gin 


Cys Met Pro 








410 










415 






420 


Asp 


His 


Val 


He Pro 


Glu 


He 


Lys 


Arg 


Thr 


Ser 


Leu 


Thr Ser Val 








425 










430 






435 


Val 


Leu 


Thr 


Leu Lys 


Cys 


Leu 


Ala 


lie 


His 


Asp 


Val 


He Arg Phe 








440 










445 






450 


Pro 


Tyr 


Leu 


Asp Pro 


Pro 


Asn 


Glu 


Arg 


Leu 


He 


Leu 


Glu Ala Leu 








455 










460 






465 


Lys 


Gin 


Leu Tyr Gin 


Cys 


Asp 


Ala 


He 


Asp 


Arg 


Ser Gly His Val 








470 










475 






480 
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Thr 


Arg 


Leu Gly 


Leu 


Ser 


Met 


Val 


Glu 


Phe 


Pro 


Leu Pro 


Pro 


His 










485 










490 








495 


Leu 


Thr Cys 


Ala 


Val 


He 


Lys 


Ala 


Ala 


Ser 


Leu 


Asp Cys 


Glu 


Asp 










500 










505 








510 


Leu 


Leu 


Leu 


Pro 


He 


Ala 


Ala 


Met 


Leu 


Ser 


Val 


Glu Asn 


Val 


Phe 










515 










520 








525 


lie 


Arg 


Pro 


Val 


Asp 


Pro 


Glu 


Tyr 


Gin 


Lys 


Glu 


Ala Glu 


Gin 


Arg 










530 










535 








540 


His 


Arg 


Glu 


Leu 


Ala 


Ala 


Lys 


Ala Gly Gly 


Phe 


Asn Asp 


Phe 


Ala 










545 










550 








555 


Thr 


Leu 


Ala 


Val 


He 


Phe 


Glu 


Gin 


Cys 


Lvs 


Ser 


Ser Gly 


Ala 


Pro 










560 










565 








570 


Ala 


Ser


Ttt) 
ii f 


Cys 


Gin 


Lys 


His 


Trp 


He 


His 


Trp 


Arg Cys 


Leu 


Phe 










575 










580 








585 


Ser 


Ala 


Phe 


Arg 


Val 


Glu 


Ala 


Gin 


Leu 


Arcr 


Glu 


Leu He 


Arg 


Lys 










590 










595 








600 


Leu 


Lys 


Gin 


Gin 


Ser 


Asp 


Phe 


Pro 


Lys 


Glu 


Thr 


Phe Glu Gly 


Pro 








605 










610 








615 


Lyg 


His 


Glu 


Val 


Leu 


Arg 


Arg 


Cys 


Leu 


Cvs 


Ala 


Gly Tyr 


Phe 


Lys 








620 










625 








630 


Asn 


Val 


Ala 




Arg 


Ser 


Val 


Gly Arg 


Thr 


Phe 


Cys Thr 


Met 


Asp 










635 










640 








645 


Glv 




Glv 


Ser 


Pro 


Val 


His 


He 


His


Pro 


Ser 


Ser Ala 


Leu 


His 








650 










655 








660 


Glu 


Gin 


Glu 


Thr 


Lys 


Leu 


Glu 


Trp 


He 


He 


Phe 


His Glu 


Val 


Leu 










665 










670 








675 


Val 


Thr 


Thr 


Lys 


Val 


Tyr 


Ala 


Arg 


He 


Val 


Cys 


Pro He 


Arg 


Tyr 










680 










685 








690 


Glu 


T 1 7-7-1 


Val 


Arg 


Asp 


Leu 


Leu 


Pro 


Lys 


Leu 


His 


Glu Leu 


Asn 


Ala 










695 










700 








705 


His 


Asp 


Leu 


Ser 


Ser 


Val 


Ala 


Arg 


Arg 


Glu 


Met 


Arg Glu 


Asp 


Ala 










710 










715 








720 


Arg 


Arg 


Lys 


Trp 


Thr 


Asn 


Lys 


Glu 


Asn 


Val 


Lys 


Gin Leu 


Lys 


Asp 










725 










730 








735 


Gly 


lie 


Ser 


Lys 


Glu 


Val 


Leu 


Lys 


Lys 


Met 


Gin 


Arg Arg 


Asn 


Asp 










740 










745 








750 


Asp 


Lys 


Ser 


He 


Ser 


Asp 


Ala 


Arg 


Ala 


Arg 


Phe 


Leu Glu 


Arg 


Lys 










755 










760 








765 


Gin 


Gin 


Arg 


He 


Gin 


Asp 


His 


Ser Asp 


Thr 


Leu 


Lys Glu 


Thr 


Gly 










770 










775 








780 



<210> 32 

<211> 648 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 4447743CD1 

<400> 32 

Met Glu Leu Val Thr Phe Arg Asp Val Ala Ile Glu Phe Ser Pro
1 5 10 15

Glu Glu Trp Lys Cys Leu Asp Pro Ala Gln Gln Asn Leu Tyr Arg
20 25 30

Asp Val Met Leu Glu Asn Tyr Arg Asn Leu Val Ser Leu Gly Phe
35 40 45

Val Ile Ser Asn Pro Asp Leu Val Thr Cys Leu Glu Gln Ile Lys
50 55 60

Glu Pro Cys Asn Leu Lys Ile His Glu Thr Ala Ala Lys Pro Pro
65 70 75

Ala Ile Cys Ser Pro Phe Ser Gln Asp Leu Ser Pro Val Gln Gly
80 85 90

Ile Glu Asp Ser Phe His Lys Leu Ile Leu Lys Arg Tyr Glu Lys
95 100 105

Cys Gly His Glu Asn Leu Gln Leu Arg Lys Gly Cys Lys Arg Val
110 115 120

Asn Glu Cys Lys Val Gln Lys Gly Val Asn Asn Gly Val Tyr Gln
125 130 135

Cys Leu Ser Thr Thr Gln Ser Lys Ile Phe Gln Cys Asn Thr Cys
140 145 150

Val Lys Val Phe Ser Lys Phe Ser Asn Ser Asn Lys His Lys Ile
155 160 165

Arg His Thr Gly Glu Lys Pro Phe Lys Cys Thr Glu Cys Gly Arg
170 175 180

Ser Phe Tyr Met Ser His Leu Thr Gln His Thr Gly Ile His Ala
185 190 195

Gly Glu Lys Pro Tyr Lys Cys Glu Lys Cys Gly Lys Ala Phe Asn
200 205 210

Arg Ser Thr Ser Leu Ser Lys His Lys Arg Ile His Thr Gly Glu
215 220 225

Lys Pro Tyr Thr Cys Glu Glu Cys Gly Lys Ala Phe Arg Arg Ser
230 235 240

Thr Val Leu Asn Glu His Lys Lys Ile His Thr Gly Glu Lys Pro
245 250 255

Tyr Lys Cys Glu Glu Cys Gly Lys Ala Phe Thr Arg Ser Thr Thr
260 265 270

Leu Asn Glu His Lys Lys Ile His Thr Gly Glu Lys Pro Tyr Lys
275 280 285

Cys Lys Glu Cys Gly Lys Ala Phe Arg Trp Ser Thr Ser Leu Asn
290 295 300

Glu His Lys Asn Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Lys
305 310 315

Glu Cys Gly Lys Ala Phe Arg Gln Ser Arg Ser Leu Asn Glu His
320 325 330

Lys Asn Ile His Thr Gly Glu Lys Pro Tyr Thr Cys Glu Lys Cys
335 340 345

Gly Lys Ala Phe Asn Gln Ser Ser Ser Leu Ile Ile His Arg Ser
350 355 360

Ile His Ser Glu Gln Lys Leu Tyr Lys Cys Glu Glu Cys Gly Lys
365 370 375

Ala Phe Thr Trp Ser Ser Ser Leu Asn Lys His Lys Arg Ile His
380 385 390

Thr Gly Glu Lys Pro Tyr Thr Cys Glu Glu Cys Gly Lys Ala Phe
395 400 405

Tyr Arg Ser Ser His Leu Ala Lys His Lys Arg Ile His Thr Gly
410 415 420

Glu Lys Pro Tyr Thr Cys Glu Glu Cys Gly Lys Ala Phe Asn Gln
425 430 435

Ser Ser Thr Leu Ile Leu His Lys Arg Ile His Ser Gly Gln Lys
440 445 450
440 445 450 
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Pro 


Tvr 


Lys 


Cvs 


Glu 


Glu 


Cys 


Gly 


Lys 


Ala Phe 


Thr Arg 


Ser 


Thr 










455 










460 








465 


Thr 


Leu 


Asn 


Glu 


His 
470 


Lys 


Lys 


lie 


His 


Thr Gly 
475 


Glu 


Lys 


Pro 


Tyr 
480 


Lys 


Cys 


Glu 


Glu 


Cys 
485 


Gly 


Lys 


Ala 


Phe 


lie Trp 
490 


Ser 


Ala 


Ser 


Leu 
495 


Asn 


Glu 


His 


Lys 


Asn 


lie 


His 


Thr 


Gly 


Glu Lys 


Pro 


Tyr Lys 


Cys 










500 










505 








510 


Lys 


Glu 


Cys 


Glv 


Lys 


Ala 


Phe 


Asn 


Gin 


Ser Ser 


Gly Leu 


He 


He 










515 










520 








525 


His 


Arg 


Ser 


lie 


His 
530 


Ser 


Glu 


Gin 


Lys 


Leu Tyr 
535 


Lys 


Cys 


Glu 


Glu 
540 


Qyg 


Gly 


Lys 


Ala 


Phe 
545 


Thr 


Aro 


Ser 


Thr 


Ala Leu 
550 


Asn 


Glu 


His 


Lys 
555 


Lys 


lie 


His 


Ser 


Gly Glu 


Lys 


Pro 


Tvr 


Lys Cys 


Lys 


Glu 


Cys 


Gly 










560 










565 








570 


Lys 


Ala 


Tyr 


Asn 


Leu 
575 


Ser 


Ser 


Thr 


Leu 


Thr Lys 
580 


His 


Lys 


Arg 


He 
585 


His 


Thr 


Gly Glu 


Lys 


Pro 


Phe 


Thr 


Cvs 


Glu Glu 


Cys 


Gly Lys 


Ala 










590 










595 








600 




Asn 


Trp 


Ser 


Ser 


Ser 


Leu 


Thr 


Lys 


His Lys 


He 


He 


His 


Thr 








605 










610 








615 


Gly 


Glu 


Lys 


Ser 


Tyr 
620 


Lys 


Cys 


Glu 


Glu 


Cys Gly 
625 


Lys 


Ala 


Phe 


Asn 
630 


Arg 


Pro 


Ser 


Thr 


Leu 


Thr 


Val 


His 


Lys 


Arg lie 


His 


Thr Gly 


Lys 










635 










640 








645 


Glu His Ser

























<210> 33 

<211> 602 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7497554CD1 

<400> 33 

Met Ser Glu Arg Arg Arg Ser Ala Val Ala Leu Ser Ser Arg Ala
1 5 10 15

His Ala Phe Ser Val Glu Ala Leu Ile Gly Ser Asn Lys Lys Arg
20 25 30

Lys Leu Arg Asp Trp Glu Glu Lys Gly Leu Asp Leu Ser Met Glu
35 40 45

Ala Leu Ser Pro Ala Gly Pro Leu Gly Asp Thr Glu Asp Ala Ala
50 55 60

Ala His Gly Leu Glu Pro His Pro Asp Ser Glu Gln Ser Thr Gly
65 70 75

Ser Asp Ser Glu Val Leu Thr Glu Arg Thr Ser Cys Ser Phe Ser
80 85 90

Thr His Thr Asp Leu Ala Ser Gly Ala Ala Gly Pro Val Pro Ala
95 100 105

Ala Met Ser Ser Met Glu Glu Ile Gln Val Glu Leu Gln Cys Ala
110 115 120

Asp Leu Trp Lys Arg Phe His Asp Ile Gly Thr Glu Met Ile Ile
125 130 135

Thr Lys Ala Gly Arg Arg Met Phe Pro Ala Met Arg Val Lys Ile
140 145 150

Thr Gly Leu Asp Pro Asn Gln Gln Tyr Tyr Ile Ala Met Asp Ile
155 160 165

Val Pro Val Asp Asn Lys Arg Tyr Arg Tyr Val Tyr His Ser Ser
170 175 180

Lys Trp Met Val Ala Gly Asn Ala Asp Ser Pro Val Pro Pro Arg
185 190 195

Val Tyr Ile His Pro Asp Ser Leu Ala Ser Gly Asp Thr Trp Met
200 205 210

Arg Gln Val Val Ser Phe Asp Lys Leu Lys Leu Thr Asn Asn Glu
215 220 225

Leu Asp Asp Gln Gly His Ile Ile Leu His Ser Met His Lys Tyr
230 235 240

Gln Pro Arg Val His Val Ile Arg Lys Asp Phe Ser Ser Asp Leu
245 250 255

Ser Pro Thr Lys Pro Val Pro Val Gly Asp Gly Val Lys Thr Phe
260 265 270

Asn Phe Pro Glu Thr Val Phe Thr Thr Val Thr Ala Tyr Gln Asn
275 280 285

Gln Gln Ile Thr Arg Leu Lys Ile Asp Arg Asn Pro Phe Ala Lys
290 295 300

Gly Phe Arg Asp Ser Gly Arg Asn Arg Thr Gly Leu Glu Ala Ile
305 310 315

Met Glu Thr Tyr Ala Phe Trp Arg Pro Pro Val Arg Thr Leu Thr
320 325 330

Phe Glu Asp Phe Thr Thr Met Gln Lys Gln Gln Gly Gly Ser Thr
335 340 345

Gly Thr Ser Pro Thr Thr Ser Ser Thr Gly Thr Pro Ser Pro Ser
350 355 360

Ala Ser Ser His Leu Leu Ser Pro Ser Cys Ser Pro Pro Thr Phe
365 370 375

His Leu Ala Pro Asn Thr Phe Asn Val Gly Cys Arg Glu Ser Gln
380 385 390

Leu Cys Asn Leu Asn Leu Ser Asp Tyr Pro Pro Cys Ala Arg Ser
395 400 405

Asn Met Ala Ala Leu Gln Ser Tyr Pro Gly Leu Ser Asp Ser Gly
410 415 420

Tyr Asn Arg Leu Gln Ser Gly Thr Thr Ser Ala Thr Gln Pro Ser
425 430 435

Glu Thr Phe Met Pro Gln Arg Thr Pro Ser Leu Ile Ser Gly Ile
440 445 450

Pro Thr Pro Pro Ser Leu Pro Gly Asn Ser Lys Met Glu Ala Tyr
455 460 465

Gly Gly Gln Leu Gly Ser Phe Pro Thr Ser Gln Phe Gln Tyr Val
470 475 480

Met Gln Ala Gly Asn Ala Ala Ser Ser Ser Ser Ser Pro His Met
485 490 495

Phe Gly Gly Ser His Met Gln Gln Ser Ser Tyr Asn Ala Phe Ser
500 505 510

Leu His Asn Pro Tyr Asn Leu Tyr Gly Tyr Asn Phe Pro Thr Ser
515 520 525

Pro Arg Leu Ala Ala Ser Pro Glu Lys Leu Ser Ala Ser Gln Ser
530 535 540

Thr Leu Leu Cys Ser Ser Pro Ser Asn Gly Ala Phe Gly Glu Arg
545 550 555

Gln Tyr Leu Pro Ser Gly Met Glu His Ser Met His Met Ile Ser
560 565 570

Pro Ser Pro Asn Asn Gln Gln Ala Thr Asn Thr Cys Asp Gly Arg
575 580 585

Gln Tyr Gly Ala Val Pro Gly Ser Ser Ser Gln Met Ser Val His
590 595 600

Met Val

<210> 34
<211> 388
<212> PRT
<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7475843CD1

<400> 34

Met Leu Glu Asn [illegible] Arg Asn Leu Val Ser Leu Gly Ile Ala Val
1 5 10 15

Ser Lys Pro Asp Leu Ile Thr Cys Leu Glu Gln Arg Asn Glu Pro
20 25 30

[illegible] Asn Val Lys Lys His Glu Thr Val Ala [illegible] His Pro Ala Val
35 40 45

Ser Ser His Phe Thr Gln Asp Leu Leu Pro Glu His Gly Ile Lys
50 55 60

Asp Ser Phe Gln [illegible] Val Ile Leu Arg Arg Tyr Gly Ser Tyr Gly
65 70 75

Ile Glu Asn Leu Gln Leu Lys Lys Asp Trp Glu Ser Val Gly Glu
80 85 90

Ser Lys Val Gln Lys Glu Cys Cys Asn Gly Leu Asn Gln Ser Leu
95 100 105

Ser Thr Thr His Thr Lys Ile Phe Gln Phe Asn Lys Cys Val Lys
110 115 120

Val Phe Ser Lys Ser Ser Asn Leu Asn Arg His Lys Ile Arg His
125 130 135

Thr Gly Glu Ile Ser Ser Asn Cys Lys Glu Cys Asp Asn Ser Phe
140 145 150

Tyr Ile Ser Ser Val Leu Thr Pro Leu Gln Arg Ile His Thr Ala
155 160 165

Glu Lys Ser Tyr Lys Cys Lys Gln Cys Gly Lys Ala Phe Arg His
170 175 180

Cys Ser Cys Phe Leu Glu His Glu Thr Ile His Asn Glu Glu Lys
185 190 195

His Tyr Lys Cys Lys Glu Cys Gly Lys Val Phe Lys Ser Phe Thr
200 205 210

Ser Leu Ser Asn His Ile Ile Ile His Thr Gly Lys Lys Leu Tyr
215 220 225

Lys Cys Glu Glu Cys Gly Lys Ala Phe Asn His Ser Ser Asn His
230 235 240

Ala Lys His Lys Lys Ile His Thr Gly Gln Lys Pro His Lys Cys
245 250 255

Glu Glu Cys Gly Lys Ala Phe Asn Trp Phe Ser Tyr Leu Thr Leu
260 265 270

His Lys Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Asp Glu
275 280 285

Cys Gly Lys Ala Phe Asn Gln Cys Ser Asn Leu Thr Lys His Lys
290 295 300

Arg Ile His Thr Gly Glu Lys Pro Tyr Lys Cys Glu Glu Cys Gly
305 310 315

Lys Ala Phe Asn Arg Cys Ser His Leu Thr Glu His Lys Arg Ile
320 325 330

His Thr Gly Glu Lys Pro [illegible] Lys [illegible] Glu Glu Cys Gly Lys Val
335 340 345

Phe Ile Ser Cys Ser Ser Leu Ser Asn His Lys Arg Ile His Thr
350 355 360

Arg Glu Lys Cys Tyr Lys Ser Glu Glu Cys Gly Lys Thr Phe Asn
365 370 375

His Cys Ser Asp Leu Asn Val Pro Glu Lys Ile His Thr
380 385






<210> 35 

<211> 480 

<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 6319550CD1



<400> 35 

Met Gly Ser Pro Ala Ala Pro Glu Gly Ala Leu Gly Tyr Val Arg
1 5 10 15

Glu Phe Thr Arg His Ser Ser Asp Val Leu Gly Asn Leu Asn Glu
20 25 30

Leu Arg Leu Arg Gly Ile Leu Thr Asp Val Thr Leu Leu Val Gly
35 40 45

Gly Gln Pro Leu Arg Ala His Lys Ala Val Leu Ile Ala Cys Ser
50 55 60

Gly Phe Phe Tyr Ser Ile Phe Arg Gly Arg Ala Gly Val Gly Val
65 70 75

Asp Val Leu Ser Leu Pro Gly Gly Pro Glu Ala Arg Gly Phe Ala
80 85 90

Pro Leu Leu Asp Phe Met Tyr Thr Ser Arg Leu Arg Leu Ser Pro
95 100 105

Ala Thr Ala Pro Ala Val Leu Ala Ala Ala Thr Tyr Leu Gln Met
110 115 120

Glu His Val Val Gln Ala Cys His Arg Phe Ile Gln Ala Ser Tyr
125 130 135

Glu Pro Leu Gly Ile Ser Leu Arg Pro Leu Glu Ala Glu Pro Pro
140 145 150

Thr Pro Pro Thr Ala Pro Pro Pro Gly Ser Pro Arg Arg Ser Glu
155 160 165

Gly His Pro Asp Pro Pro Thr Glu Ser Arg Ser Cys Ser Gln Gly
170 175 180

Pro Pro Ser Pro Ala Ser Pro Asp Pro Lys Ala Cys Asn Trp Lys
185 190 195

Lys Tyr Lys Tyr Ile Val Leu Asn Ser Gln Ala Ser Gln Ala Gly
200 205 210

Ser Leu Val Gly Glu Arg Ser Ser Gly Gln Pro Cys Pro Gln Ala
215 220 225

Arg Leu Pro Ser Gly Asp Glu Ala Ser Ser Ser Ser Ser Ser Ser
230 235 240

Ser Ser Ser Ser Ser Glu Glu Gly Pro Ile Pro Gly Pro Gln Ser
245 250 255

Arg Leu Ser Pro Thr Ala Ala Thr Val Gln Phe Lys Cys Gly Ala
260 265 270

Pro Ala Ser Thr Pro Tyr Leu Leu Thr Ser Gln Ala Gln Asp Thr
275 280 285

Ser Gly Ser Pro Ser Glu Arg Ala Arg Pro Leu Pro Gly Ser Glu
290 295 300

Phe Phe Ser Cys Gln Asn Cys Glu Ala Val Ala Gly Cys Ser Ser
305 310 315

Gly Leu Asp Ser Leu Val Pro Gly Asp Glu Asp Lys Pro Tyr Lys
320 325 330

Cys Gln Leu Cys Arg Ser Ser Phe Arg Tyr Lys Gly Asn Leu Ala
335 340 345

Ser His Arg Thr Val His Thr Gly Glu Lys Pro Tyr His Cys Ser
350 355 360

II. cy» Gly »1. A« Phe As* A« Pro Ala A- « W ««
365 370 375

_ A„ IX. Hi, s" cly «« W P- ^ W W "» Tte ^
380 385 390

Gly Ser Arg Phe Val Gln Val Ala His Leu Arg Ala His Val Leu
395 400 405

Ile His Thr Gly Glu Lys Pro Tyr Pro Cys Pro Thr Cys Gly Thr
410 415 420

Arg Phe Arg His Leu Gln Thr Leu Lys Ser His Val Arg Ile His
425 430 435

Thr Gly Glu Lys Pro Tyr His Cys Asp Pro Cys Gly Leu His Phe
440 445 450

Arg His Lys Ser [illegible] Leu Arg Leu His Leu Arg Gln Lys His Gly
455 460 465

Ala Ala Thr Asn Thr Lys Val His Tyr His Ile Leu Gly Gly Pro
470 475 480



<210> 36 
<211> 790 
<212> PRT 

<213> Homo sapiens
<220>

<221> misc_feature

<223> Incyte ID No: 7510064CD1 



<400> 36

Met Ala Leu Gly Leu Gln Arg Ala Arg Pro Ala Leu Ser Cys Gly
1 5 10 15

Val Ile Ser Pro Pro Cys Ala Pro Thr Arg Asn Ser His Pro Gly
20 25 30

Pro Gly Cys Thr Ala Ser Pro Pro Ala Pro Pro Gly Trp Pro Phe
35 40 45

Ser Gln Arg Gly Pro Gly Arg Trp Ser Thr Thr Glu Leu Arg [illegible]
50 55 60

Glu Lys Ser Arg Asp Ala Ala Arg Ser Arg Arg Ser Gln Glu Thr
65 70 75


OXU. VaJL 




Gin 


Leu 


a 1 = 

nxa 


His 


Thr 


Leu 


Pro 


Phe Ala 


Arg 


Gly 
















85 








90 


\Ta~\ Cor 






7\ er\ 


Lys 


Ala 


Ser 


lie 


Met 


Arg Leu 


Thr 


He 






95 










100 








105 


ocx ± y jl 




Met 


His 




Leu 


Cys 


Ala Ala Gly Glu Trp 


Asn 






1 1 n 

ilv 










1 1 — ' 








120 


bill val 


Gly Ala 


Gly 


Gly 


Glu 


Pro 


Leu 


Asp 


Ala 


Cvs Tvr 


Leu 


Lys 






125 










1 "}0 








135 


Ala Leu 


Glu Gly 


Phe 


Val 




Val 


Leu 


JL IXJL 


Ala 


Glu Gly 


Asp Met 






140 










lilt: 
1 *± -> 








150 


Ala Tyr 


Leu Ser 


Glu 


Asn 


val 


Ser Lys 


nib 


Leu 


Gly Leu 


Ser 


Gin 






155 










1 fin 








165 


Leu Lj± u. 


Leu lie Gly His 


C? a-y- 
O fcJl 


Tl « 

lie 


irne 


Asp 


Phe 


He His 


Pro 








170 










1 / -J 








180 


Asp uin 


Glu Glu 


Leu 


Gin 


Asp 


Aia 


T at 1 

Leu 


Thr 


Pro 


Gin Gin 


Thr 


Leu 






loD 










1 on 

IjU 








195 


Ser Arg 


Arg Lys 


val 


VjIU 


Al » 
nl a. 


Pro 


inr 


wl U 


Arg 


Cys Phe 


Ser 


Leu 






r\ r\ 










9 n r 








91 0 


Arg Met 


Lys Ser 


Thr 


Leu 


x nr 


Ser 


Arg 


vjiy 


Arg 


Thr Leu 


Asn 


Leu 






215 










£» £t \J 








225 


Lys Ala 


Ala Thr 


Trp 


Lys 


Vai. 


T ,—.1 * 

L»eu 


Asn 


Cys 


Ser Gly His 


Met 


Arg 






23 0 










9*3 
/ j j 








940 


Ala Tyr 


Lys Pro 


Pro 


Ala 




Thr 


Ser 


rro 


Ala 


Gly Ser 


Pro 


Asp 






24b 










TEA 








9RR 




Pro Pro 


Leu 


Gin 




Leu 


Val 


Leu 


He 


Cys Glu 


Ala 


Tl e 
lie 






260 










265 








970 


irro H.1S 


Pro Gly 


Ser 


Leu 


CZ~\ 11 
VJl Li 


Pro 


Pro 


Leu 


Gly 


Arg Gly Ala 


Phe 






275 










280 








9R^ 


Leu. Ser 


Arg His 


Ser 


Leu 


Asp 


Mai* 


Lys 


Phe 


Thr 


Tyr Cys 


Asp 


Asp 






290 










295 








300 


Arg lie 


Ala Glu 


Val 


Ala 




Tyr 


Ser 


Pro 


Asp 


Asp Leu 


He 


Glv 






305 










310 








~> X ~J 


Cys Ser 


Ala Tyr 


Glu 


Tyr 


lie 


All S 


Ala 


Leu 


Asp 


Ser Asp 


Ala 


Val 






320 










325 








330 


Ser Lys 


Ser lie 


His 


Thr 


Leu 


T All 

LGU 


Ser 


Lys 


Gly Gin Ala 


Val 


Thr 






335 










340 








0 *± — > 


Gly Gin Tyr Arg 


Phe 


Leu 


Ala 


Arg 


Ser Gly 


Gly Tyr Leu 


Trp 


Thr 






350 










355 








360 


Gin Thr 


Gin Ala 


Thr 


Val 


Val 


Ser 


Gly 


Gly 


Arg 


Gly Pro 


Gin 


Ser 






365 










370 








375 


Glu Ser 


lie Val 


Cys 


Val 


His 


Phe 


Leu 


lie 


Ser 


Gin Val 


Glu 


Glu 






380 










385 








390 


Thr Gly Val Val 


Leu 


Ser 


J_i su 


Glu 


Gin 


Thr 


Glu 


Gin His 


Ser 


Arg 






395 










400 








405 


Arg Pro 


He Gin Arg Gly 


Ala 


Pro 


Ser 


Gin 


Lys 


Asp Thr 


Pro 


Asn 






410 










415 








420 


Pro Gly 


Asp Ser 


Leu 


Asp 




Pro 


Gly 


Pro 


Arg 


He Leu 


Ala 


Phe 






425 










430 








435 


Leu His 


Pro Pro 


Ser 


Leu 


Ser 


Glu 


Ala 


Ala 


Leu 


Ala Ala 


Asp 


Pro 






440 










445 








450 


Arg Arg 


Phe Cys 


Ser 


Pro 


Asp 


Leu 


Arg 


Arg 


Leu 


Leu Gly 


Pro 


He 






455 










460 








465 


Leu Asp Gly Ala 


Ser 


Val 


Ala 


Ala 


Thr 


Pro 


Ser 


Thr Pro 


Leu 


Ala 






470 










475 








480 


Thr Arg 


His Pro 


Gin 


Ser 


Pro 


Leu 


Ser 


Ala 


Asp 


Leu "Pro 


Asp 


Glu
485










490 






495 


Lieu. 


Pro 


Val 


Glv 


Thr 


Glu 


Asn 


Val 


His 


Arg 


Leu Phe Thr 


Ser Gly 










500 










505 






510 




Asp Thr 


Glu 


Ala 


Val 


Glu 


Thr 


Asp 


Leu 


Asp He Ala 


Gin 


Met 










515 










520 






525 


ax y 


j_i jr & 


Leu 




Leu 




Leu 


Leu Thr Thr Gly Thr Glu 


Leu 


Arg 










530 










535 






540 


ser 


Asp 


uiy 


Ala 


Gly 


Thr 


Ser 


Ala 


Lys 


Val 


His Pro Ser 


Pro 


Arg 










545 










550 






555 


Leu 


X X fc= 


L6U 


Leu 


Pro 


Pro 


Ser 


Cys 


Pro 


Pro 


Gin Asp Ala 


Asp 


Ala 










560 










JD J 






570 


XjSTJ. 


Asp 


Leu 


Glu 


Met 


Leu 


Ala 


Pro 


Tyr 


He 


Ser Met Asp 


Asp 


Asp 










575 










580 






585 


pi-, o 

A lie 


bin 


Leu 


A sin 


Ala 


Ser 


Glu 


Gin 


Leu 


Pro 


Arg Ala Tyr 


His 


Arg 




















595 






600 


x^ro 


Leu 


uiy 


nia 


Val 


Pro 




Pro 


Arg 


Ala 


Arg Ser Phe 


His 


Gly 










DUO 










610 






615 


Leu, 


Ser 


Pro 




Ala 


Leu 


Glu 


Pro 


Ser 


Leu 


Leu Pro Arg 


Trp 


Gly 




















625 






630 


Ocl 


Asp 


riu 


ax. y 


Leu 


Ser 


Cy s 


Ser 


Ser 


Pro 


Ser Arg Gly Asp 


Pro 










O ~J J 










640 






645 


c pr 


Hid 


Coy- 

oex 


Ser 


Pro 


Met 


Ala 


Gly Ala 


Arg 


Lys Arg Thr 


Leu 


Ala 










650 










655 






660 


r;1 -n 

VJ J. 11 


OCl 


Ser 


Glu 


Asp 


Glu 


Asp 


Glu Gly Val 


Glu Leu Leu 


Gly 


Val 










665 










670 






675 


Arg 


±r ro 


Pro 




A Wf 

«x y 


Ser 


Pro 


Ser 


Pro 


Glu 


His Glu Asn 


Phe 


Leu 










680 










685 






690 


Leu 


"DVi a 

irne 


Pro 




Ser 


Leu 


Ser 


Phe 


Leu 


Leu Thr Gly Gly 


Pre 












U J J 










700 






705 


Pro 


oiy 


Ser 


Leu 


VJlU 


Asp 


irro 


Thr 


Glu 


Leu 


Thr Gin Phe 


Leu 


Leu 










/ X\J 










715 






720 


Ser 


Val 


Leu 


oci 


Phe 


Pro 

XT X U 


He 


Leu 


Asp 


Pro 


Tyr Pro Leu 


Gly 


Cvs 










79 R 










730 






735 


Ala 
Ala 


Ala 


Pro 


Gly 


Leu 


His 


Ala 


Ser 


Pro 


Phe 


Ser Leu Pro 


Thr 


He 










7 AO 










745 






750 


Ser 


Val 


Pro 


Gin 


Asn 


Pro 


Leu 


His 


Phe 


Pro 


Pro Gin Pro 


Ser 


Arg 










755 










760 






765 


His 


Ala 


Leu 


Thr 


Leu 


Thr 


Leu 


Pro 


His 


Met 


Phe Gly Ala 


Pro 


Gly 










770 










775 






780 


Ala 


Pro 


Ser 


Pro 


Leu 


Gly Trp 


Phe 


Ala 


He 
















785 










790 









<210> 37 
<211> 1154 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 7490148CB1 



<400> 37 

agcatcacgt gccagggtgg ggggctataa aatacccgag ccgggcgccg gcgggggacg 60
tgaggacagc cctctccggg gacccctttg ttcccagccc agacgccaac acctctgcgt 120
ccccaagggc ttgactgccc gtgtctgcgc ggctcccagg gcagagctta gaacactaga 180
ggagaggggt cgccgcgaac tgccggggct tccagccacc cacccctctc gacatgtcgc 240
gctccttcta tgtcgactcg ctcatcatca aggacacctc acggcctgcg ccctcgctgc 300

ctgaaccgca ccccgggccg gatttcttca tcccgcttgg catgccgccc ccattggtga 360 

tgtccgtgtc cggccccggc tgcccgtccc gcaagagcgg cgcgttcfcgc gtgtgccctc 420 

tctgcgtcac ttcgcacctg cactcctctc gggggtctgt gggccccgcc agcgggggcg 480 

cagggccggg gtttcccggg cccggagaca gtggggtggc agggcccgca ggggcactgc 540 

ctctgcttaa gggccagttc tcttcggctc ctggggacgc gcagttttgc ccgcgggtga 600 

accatgcgca tcatcaccac cacccgccgc agcaccacca tcaccatcat cagccccagc 660 

agcctggctc ggccgcggcg gcggcagcag cagcagcggc ggcggcggcc gcggcggcct 720 

tggggcaccc gcagcaccac gcacctgtct gcaccgccac cacctacaac gtggcggacc 780

cgcggagatt ccactgcctc accatgggag gctctgacgc cagccaggta cccaatggca 840 

agaggatgag gacggcgttc actagcacgc aactcctgga gctggagaga gaattctctt 900 

ccaacatgta cctgtctcga ctccggagga ttgaaatcgc cacttacctg aacctgtcgg 960

agaagcaggt gaaaatctgg tttcagaacc gccgagtgaa gcacaagaag gaggggaagg 1020 

gcacgcagag gaacagtcac gcgggctgca agtgcgtcgg gagccaggtg cactacgcgc 1080 

gctccgagga tgaggactcc ctgtcgccgg cctcagccaa cgatgacaag gagatttccc 1140 

ccttatgagg gagg 1154 



<210> 38 

<211> 754 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 7490301CB1 



<400> 38 

agccaaaatc tttgaaaggt tctcattgca gatccataat caaaccaccc cagccagaac 60
aatttctgcc tctacagcgc tgtcatggag cagacctgaa tctcacccat ggagacaggc 120
aggcaagcag gtgtgtctgc tgagatgttc gccatgcccc gagatctgaa gggcagcaac 180
aaggatggga tccctgagga cctagatggg aacttggaag aacccaggga tcaggaaggt 240
gagctcagga gtgaggatgt catggacctc acagaaggtg acaatgaggc ctcagcctca 300
gctcctcctg cagccaaaag acggaaaaca gataccaaag gaaagaagga gaggaagccc 360
actgtggatg cagaggaggc tcagaggatg acaaccctgc tgtctgccat gtctgaggag 420
cagctgtccc gctacgaagt gtgtcgccgg tcagctttcc caaaagcatg cattgcgggt 480
ctgatgcggt ctatcactgg cagatcggtg tctgagaacg tggcgattgc catggctgga 540
atagccaagg tctttgttgg agaggtggtg gaagaggccc tggacgtgtg tgagatgtgg 600
ggagaaatgc ccccactgca gcccaagcat ttaagggagg ctgttcgcag gttaaagccc 660
aagggcctct tccccaacag caactacaaa aaaatcatgt tctaggccca aggccagacg 720
gaggggtctg tttgtgcagg aataagtacc gcat 754



<210> 39 

<211> 2483 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 2383223CB1 



<400> 39 

gtgcgatcgg gttgtgctta gcttggggtc tcctggcccc ttgacgcgtc aggttgctgt 60
acccctgcat cggatgcgct gtaccctgcg ctggctccgt gaaccttagg gacaacaccg 120
ggacacccgc gaggccggaa aatggactca gtggcttttg aggatgtgtc tgtgagcttc 180
agccaggagg agtgggctct gctggctcct tcacagaaga aactctacag agatgtgatg 240
caggaaacat tcaagaacct ggcatctata ggggaaaaat gggaagaccc gaatgttgaa 300
gatcaacaca aaaaccaagg acgaaatcta agaagccata cgggagagag actctgtgaa 360
ggtaaagaag gtagtcaatg tgcagaaaac ttcagtccca atctcagtgt gacgaagaag 420 
actgccggag taaaaccata tgagtgtact atctgtggaa aagccttcat gcgtctctca 480 
tcccttacta gacacatgag gtctcacact ggatacgagc tatttgagaa gccatataaa 540 
tgtaaggagt gtgagaaagc ctttagttat ctcaaatcct ttcaaagaca tgaaaggagt 600 
cacactggag aaaaacccta taaatgtaaa caatgtggaa aaaccttcat atatcaccag 660 
ccctttcaaa gacatgagcg gactcacatt ggagaaaaac cctatgaatg taagcaatgt 720 
ggaaaagctc ttagttgttc cagttcgctt cgagttcatg aaaggattca cactggagaa 780 
aagccctatg aatgtaaaca atgtgggaaa gccttcagtt gttccagttc tattcgagta 840 
cacgaaagaa ctcacactgg agagaaaccc tatgcatgta aggaatgtgg gaaagccttc 900 
atttcccaca caagtgttct aacacacatg ataacacaca acggagatag accttataaa 960 
tgcaaagaat gtggaaaggc attcattttt cccagttttt tacgagtaca tgaaagaatt 1020 
cacactggag agaaacccta taaatgtaaa caatgtggta aagccttcag atgttccacc 1080 
tccattcaaa ttcatgaaag aattcatact ggagagaagc cctataaatg taaagaatgt 1140 
gggaaatctt tcagtgcacg cccagccttt cgagtacacg tgagagtgca tactggagag 1200 
aaaccctata agtgtaaaga atgtgggaaa gcctttagta gaatcagtta ctttcgaata 1260 
catgaaagga ctcacactgg agagaaaccc tacgaatgta aaaaatgtgg gaaaactttc 1320 
aattatcctc tagatttgaa aatccacaag agaaatcaca ctggagaaaa accctatgag 1380 
tgtaaggaat gtgcaaaaac cttcatttct cttgagaact ttcgaagaca catgatcacc 1440 
cacactggag acggacctta taaatgtagg gactgtggga aggtgttcat ttttcctagt 1500 
gcgttacgaa cacatgaaag aactcacact ggagagaaac cctatgaatg taaacaatgt 1560 
ggaaaagcct ttagttgttc tagttacatt cggatacata aaagaactca cactggggag 1620 
aaaccttatg aatgtaagga atgcgggaag gcctttattt atcccacaag ctttcaagga 1680 
cacatgagaa tgcatactgg agagaaaccc tataaatgta aagaatgtgg gaaggccttt 1740 
agtcttcaca gttcctttca aagacataca agaattcaca attatgagaa acctcttgaa 1800 
tgtaagcaat gtggaaaagc cttcagtgtg tccacatcct taaaaaaaca tatgagaatg 1860 
cacaatcgat agaaactcta taaatgtgag aaataggaga aagttttcaa ttctaacaga 1920 
tgctttcaaa gttgtgaaaa ttcccactga agagagaaat cctgtcaatg taagt^ratat 1980 
agaaagcgag atacaagatg attcatgtat agtcaggtac cacataatca tgtttcagca 2040 
gcaatggacc atataggttg gtagccccat aagattatat aacacctaaa acatttctat 2100 
caaccgcaat ttagtagcag tagtaacacc atagtgcagc acattattta agtgtttatg 2160 
gtgctggtgt aaacgtgctg cactttcagt tgtataaaag tacaagacag cggccgggga 2220 
cggtggctca cgcctgtaat cccagcactt tgggaggcca aggcgggcgg atcacgaggc 2280 
caggagatca agaccatcct ggctaacacg gtaaaacccc atctctacta aaaatacaaa 2340 
aaattagccg ggcgtggtgg tgggcgcctg tagtcccagc tacttgggag gctgaggcag 2400 
gagaatggca taaacccagg aggccgagct gacagtgagc tgagatccgg ccaacagagc 2460 
gagactctgt ctcaaaaaaa aaa 2483 

<210> 40 

<211> 2535 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 3495982CB1 

<400> 40 

agggtccaag cgggccgcgg ccgctgggac ggaactggtg ccctccgtgg acaattgtgt 60 
tgaagcagaa attgttccgg atctcgggtc ggacacggaa gtcttcctgc agtgtttctg 120 
gatgcgggga cagggatgcg caggaattcc agtctcagtt tccagatgga gcgacccctc 180 
gaggagcaag tccagagcaa gtggtcgtct agtcaaggcc gcacaggaac aggagggtct 240 
gatgtcctcc agatgcagaa cagtgaacac catggacaaa gcatcaagac tcaaactgac 300 
tccatctccc ttgaggatgt ggctgtgaac ttcaccctgg aggagtgggc tttgctggat 360 
cctggccaga ggaatatcta cagagatgtg atgcgggcaa ccttcaagaa cctggcctgt 420 
ataggggaaa aatggaaaga ccaggatatt gaagatgaac acaaaaacca gggaagaaat 480
ctaagaagtc ctatggttga agcactctgt gaaaataaag aagattgtcc atgtggaaaa 540
agcactagcc agattcctga tcttaatacg aacctggaaa ctcctactgg attaaaacca 600 
tgtgactgca gtgtgtgtgg ggaagtcttc atgcatcagg tctcccttaa taggcacatg 660 
agatctcaca ctgaacagaa accaaatgag tgtcacgaat atggagagaa gccacataaa 720 
tgcaaagaat gtgggaaaac cttcactcgc agctccagta ttcgaaccca tgaaagaatt 780
cacactggag agaaacccta tgaatgtaag gaatgtggca aagccttcgc atttctcttt 840 
tcctttcgaa accatataag aattcatact ggagagacac cctatgaatg taaggaatgt 900 
gggaaggcat tcagatatct cactgctctt cggcgccatg aaaaaaatca cactggagag 960 
aaaccctaca aatgtaaaca gtgtggaaaa gcctttatat attaccagcc ttttctaacc 1020 
catgaaagga ctcacactgg agagaaacct tatgaatgta agcaatgtgg gaaagccttc 1080 
agttgtccca cgtacttacg gagtcatgag aaaactcata ctggagagaa accttttgta 1140 
tgtagggaat gtgggagagc cttcttttct cactcaagcc ttcgaaaaca cgtgagccac 1200 
cacacccggc ccccagttct tttttttttt tttgagacgg agtccttgcc caggctggag 1260 
tgcagtggcg cgatctcagc ttactgcaag ctccgcctcc tgggttcacg ccattctcct 1320 
gcctcagcct cccgagtagc tgggactaca ggcgcccgcc accacgcccg gctcattttt 1380 
tgtattttta gtggagacgg ggtttcaccg tgttagccag gatagtcttg atctcctgac 1440 
ctcgtgatcc gccctcctcg gcctcccaaa gtgctgggat tataggcgtg agccaccgcg 1500 
cccggcctcc agttcatttt aaaaccatga aaccacccac actggagagc cgccttctta 1560 
aatgtaagca atatgggaac atcttcaatg acatctgcaa ccattgtcga gattcttcgt 1620 
aggaagaatt aaaggaatgg ggaagaagga cttgggcccc agcaggcctt ttgtggcatc 1680 
cagtcagaag ttaggaaaca ccctctaggg cagtgcctgc cctcagctct cacacatgac 1740 
tggagagaag tgagaagact cccagaatct gccttttaag tttccatcag tgtcctcggg 1800 
tcagactgtg ggaatgctgt acttctaaga caaaggttcg catcttacag tttttcccca 1860 
agtgggatct gcctatgcac atcctaattc acagacattg tcagcattgc attattccag 1920 
ggctgttttg aggtcatctc cagctccatt tcccatctca aacctacact ttggacactt 1980
gtttaaacat tcacatctga atttctctgt ggcaaccgca ggtggctctg aggtgagaga 2040 
ctgaagaagc agagcctgta tcatgggaca cacacccctt gaccacacag catccactct 2100 
cctgcggtta ccagccctgc accaaccacc acagtcagaa taatctcaaa actgctagtg 2160 
gagtcacaga cacactctgc catgtcttag ctgtaggaca ctcagaattc ttacctcccc 2220 
taacttgctt tccctgtttt ctggcacatt ctaccctgag atattcagcg gcatatgcac 2280 
tgatgatgct ccggccttct cacgtcagtg catgtcacat gcatcctgct ccaggactgg 2340 
tttcctagcc tcccgctaag actagcactt agctcttctt ggaaaaggac ccaccctgca 2400 
gcgctccagc tctgttgtgt atgggaggtg ctggtgtcag ggttcacaag ccatggaact 2460 
accttgcttc acatcacaag gcaaggtgaa ctcaactccc aataaactgc ttgctggagc 2520 
ctccgaaaaa aaaaa 2535

<210> 41 
<211> 2817 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7477891CB1

<400> 41 

gtctttgaaa atacttcatt ttcttagcat ttcaggagat tataacatcc tgtatttcag 60 
tttctgagag ctttactgac tgatttccct attcaaaaca atcctcattt cctacatttc 120 
tgaagatctc aagatctgga ctactgttga agaaattccc agtaaggctc acttatatct 180 
ttagagatgg caaataacta caagaaaatt gttctactga aaggattaga ggtcatcaat 240 
gattatcatt ttagaattgt taagtcctta ctgagtaacg atttaaaact taatccaaaa 300 
atgaaagaag agtatgacaa aattcagatt gctgacttga tggaggaaaa gttcccaggt 360 
gatgccggtt tgggcaaact aatagaattc ttcaaagaaa taccaacact gggagacctt 420 
gctgaaactc ttaaaagaga aaagttaaaa gtcaaaggaa taatcccatc taaaaagacg 480 
aaacagaaag aagtgtatcc tgctacacct gcatgcaccc caagcaaccg tctcacagct 540 
aaaggagcag aggagactct tggacctcag aaaagaaaaa aaccatctga agaagagact 600
ggaaccaaaa ggagtaagat gtccaaagag cagactcggc cttcctgctc tgcaggagcc 660
agcacgtcca cagccatggg ccgttcccca cctccccaga cctcatcatc agctccaccc 720 
aacacttcct caactgagag cctaaaacca ttggccaacc gtcacgcaac tgccagtaaa 780 
aatattttcc gagaagaccc aataatcgcg atggtactaa atgcaacaaa agtatttaaa 840 
tatgaatcct cagaaaatga gcaaagaaga atgtttcatg ctacagtggc tacgcagaca 900 
cagttctttc atgtgaaggt tttaaacatc aacttgaaga ggaaattcat taaaaagaga 960 
atcatcatta tatcaaatta ttccaaacgt aatagtctcc tagaggtgaa tgaagcctct 1020 
tctgtatctg aagctggtcc tgaccaaacg tttgaggttc caaaggacat catcagaaga 1080 
gcaaagaaaa ttccgaagat caatattctt cacaaacaaa cttcaggata tattgtatat 1140 
ggattattta tgctacatac gaaaattgta aataggaaga cgacaatcta tgaaattcag 1200 
gataaaacag gaagtatggc tgtagtagga aaaggagaat gccacaatat cccctgtgaa 1260 
aaaggagata agcttcgact cttctgcttt cgactgagaa agagggaaaa tatgtcaaaa 1320 
ctgatgtcag aaatgcatag tttcatccag atacagaaaa atacaaacca gagaagccat 1380 
gactccagga gcatggcact accccaggaa cagagtcagc atccaaaacc ttcagaggcc 1440 
agcacaaccc tacctgaaag ccatctcaag actcctcaga tgccaccaac aactccatcc 1500 
agcagtttct tcaccaagaa aagtgaagac acaatctcca aaatgaatga cttcatgagg 1560 
atgcagatac tgaaggaagg gagtcatttt ccaggaccgt tcatgaccag cataggccca 1620 
gctgagagcc atccccacac tcctcagatg cctccatcaa caccaagcag cagtttctta 1680 
accacgaaaa gtgaagacac aatctccaaa atgaatgact tcatgaggat gcagatactg 1740 
aaggaaggga gtcattttcc aggaccgttc atgaccagca taggcccagc tgagagccat 1800 
ccccacactc ctcagatgcc tccatcaaca ccaagcagca gtttcttaac cacgttgaaa 1860 
ccaagactga agactgaacc tgaagaagtt tccatagaag acagtgccca gagtgacctc 1920 
aaagaagtga tggtgctgaa cgcaacagaa tcatttgtat atgagcccaa agagcagaag 1980 
aaaatgtttc atgccacagt ggcaactgag aatgaagtct tccgagtgaa ggtttttaat 2040
attgacctaa aggagaagtt caccccaaag aagatcattg ccatagcaaa ttatgtttgc 2100 
cgcaatgggt tcctggaggt atatcctttc acacttgtgg ctgatgtgaa tgctgaccga 2160 
aacatggaga tcccaaaagg attgattaga agtgccagcg taactcctaa aatcaatcag 2220 
ctttgctcac aaactaaagg aagttttgtg aatggggtgt ttgaggtaca taagaaaaat 2280 
gtaaggggtg aattcactta ttatgaaata caagataata cagggaagat ggaagtggtg 2340 
gtgcatggac gactgaccac aatcaactgt gaggaaggag ataaactgaa actcacctgc 2400 
tttgaattgg caccgaaaag tgggaatacc ggggagttga gatctgtaat tcatagtcac 2460
atcaaggtca tcaagaccag gaaaaacaag aaagacatac tcaatcctga ttcaagtatg 2520 
gaaacttcac cagacttttt cttctaaaat ctggatgtca ttgacgataa tgtttatgga 2580 
gataaggtct aagtgcctaa aaaaatgtac atatacctgg ttgaaataca acactataca 2640 
tacacaccac catatatact agctgttaat cctatggaat ggggtattgg gagtgctttt 2700 
ttaatttttc atagtttttt tttaataaaa tggcatagtt ttgcatctac aacttctatt 2760 
aattcggaag ataaataact attccagagg ggggggggtc gggggggggg gagagga 2817 

<210> 42 

<211> 2295 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 72688352CB1 

<400> 42 

caggaaagtg caggttctgg gaagtctcgg tgggttcccc gcaaaatcag agatggcatc 60 

tcactatgtt gcccaggctg atcttgaact cctgacctca agtgatcctc tctccttggc 120 

ctcctaaagt gctgggattc caggtgtgag ccacttcacc cagccacatt tatccatctt 180 

gtaaaatatt gctttgtttt ttggtgcatt tcaaacccac aggtcctgac ttcaggagca 240 

agacacttca agtgaggtga ggaggaggtt ccaggacctg gataccatcc tttgcaggga 300 

tgtaattcac cagagaccaa gatttctagc cagaaattga gggcagctca aggagagact 360 

acagacacag catcctactt actggctatt tcccagtaac aaaagaaaat gatcaagtcc 420 

caggaatcac tgaccctgga ggatgtggct gtggagttca cttgggagga gtggcagctc 480

ctcggccctg ctcagaagga cctgtaccga gacgtgatgt tggagaacta tagcaacctc 540

gtgtcagtgg ggtatcaagc cagcaaacca gatgcactct tcaagttgga acaaggagag 600

ccatggacag tagaaaatga aatccacagc caaatctgtc cagaaatcaa gaaagttgac 660

aatcatctac agatgcactc acaaaagcaa agatgtctga agagagtgga acaatgccat 720

aaacataatg catttggaaa catcattcat cagaggaaaa gtgattttcc tttaaggcaa 780

aatcatgata catttgactt acatgggaaa atactgaaat caaatttaag tttagtcaac 840

cagaacaaaa ggtatgaaat caagaattct gtgggggtta atggagatgg gaaatccttc 900

cttcatgcca agcatgaaca atttcataat gaaatgaact tccccgaagg tggaaattct 960

gtgaatacaa attcacaatt cattaagcat cagcgaactc aaaacataga taaaccccat 1020

gtatgcactg agtgtgggaa ggctttcctc aagaagtctc gcctcatcta tcatcagaga 1080

gttcacactg gggagaaacc tcatggatgc agtatatgtg ggaaagcctt ctccagaaag 1140

tccgggctca ctgaacacca gagaaaccac acaggagaga aaccctatga atgcactgaa 1200

tgtgacaaag cattccgctg gaaatcacag ctcaatgcac accagaaaat tcatacagga 1260

gagaagtcat atatatgcag tgattgtgga aaaggcttca tcaagaagtc tcggctcatt 1320

aatcatcaga gagttcatac aggagagaaa ccacatggat gcagcctgtg tgggaaggcc 1380

ttctccaaaa ggtccaggct cactgaacac cagagaactc atacaggaga gaaaccctat 1440

gaatgcactg aatgtgacaa agcattccgc tggaaatcac agctcaatgc acatcagaaa 1500

gctcacacag gagagaagtc atatatatgc cgtgattgtg gaaaaggctt cattcagaag 1560

ggaaatctca ttgtacatca gcgaattcat actggagaaa aaccctatat atgcaatgaa 1620

tgtggaaaag gcttcatcca aaagggcaac ctccttattc atcgacgtac tcacactgga 1680

gagaaaccct atgtatgcaa tgaatgtggg aaaggcttca gccagaagac atgtttaata 1740

tcccatcaga gatttcacac aggaaagaca ccctttgtat gtactgagtg tggaaaatcc 1800

tgctcacaca agtcaggtct cattaaccac cagagaattc acacaggaga gaaaccctat 1860

acatgcagtg actgtgggaa agctttcaga gataaatcat gtctcaacag acatcggaga 1920

actcatacag gggagagacc gtatggatgc tctgattgtg ggaaagcttt ctcccacttg 1980

tcatgccttg tttatcataa gggaatgctg catgcaagag agaaatgtgt aggttcagtc 2040

aaattggaaa atccttgctc agagagtcat agcttatcac atacacgtga tctcatacag 2100

gataaagact ctgttaacat ggtgactctg cagatgcctt ctgtggcagc tcagacctca 2160

ttaactaaca gtgcgttcca agcagagagc aaagtagcca ttgtgagcca gcctgttgcc 2220

agaagttcag tctcagcaga tagtagaatt tgcacagaat aaaaaccata tgaatgcaaa 2280

aaaaaaaaaa aaaaa 2295



<210> 43 
<211> 958 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7490652CB1 

<400> 43 

aatacttgag gagatggata acctattttc catgacatga ttattacgca ttgcatgcct 60 
gtatcaaaac atgtagcctt ttggctctgt gagcagcacc atggctgttg gcaagaacaa 120 
gcaccttatg aaaggtggca aaaagggagc cgaaaataga gtggttgatc cattttctaa 180 
gaaagattgg tgtgatgtaa aagcacttgc tatgttcaat ataagaaata ttggagagac 240 
gctagtcacc aggactcgag gaaccaaaat tgcatctgat agcctcaagc gtcgtgtgtt 300 
tgaagtgagt cttgctgatc tgcagaatga tgaagttgca tttagaaaat tcaagctgat 360 
tgctgaagat gttcagaaaa aaactaactt ccagggcatg gatttacccg acgaaatgtg 420 
ttccgtggtc aaaaaatggc agaccatgat tgaacctcac attgatgtca agactaccga 480 
tggttatttg tttcatctac tctgtgattt tactaaaaaa cacaatctga tacagaaggc 540 
ctcttatgct cagcaccaac aggtctgcga aatccagaag aaaatgatgg aaattatgac 600 
caaaggtgca aatgacttga aagaagtggt caataaattg attccaggca gcactggaaa 660 
agaaaagctt tgcctatcta tttatcttct ccatgatgtc tttgttagaa aagtaaaaat 720 
gctgaagatg cccaagtttg atttgggaaa attcatgggt aattgttctg gaaaagccac 780 
tggggatgag acaggtgcta aagtcgaatt agctgatgga tatgaagcac tggtccaaga 840 
atctgtttaa agttcaaaat ttaaaaaaaa ggccatcctg ggccataaag caaatctctg 900 
tgaaaaaaat gaaatcatac aaattatttt ctctgatcat aatggaatta aactaaaa 958 

<210> 44 

<211> 1978 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7489744CB1 

<400> 44 

ggaggaacac gctgacacgc gggcggcggt gttataacgg tctaaggtag cgactttatg 60 
aagtcctatc ataaagaaat gaagtaaaat ttcctggaca aatctcccaa tcgctccctt 120 
acatatagtt tgaaatccct tgcccctctc ttccttgtac ttgcttggta tacacataac 180 
aggtttaaat ccagttctgt acctattcca ggctttcttt ccctttcagc ttactgtggc 240 
tggagaaaac acataccacc ctaatgacta atctcactcc acaaccattc acaacattga 300 
tccttcaaaa ggtcaggctc tcatactatt ctcccagtta cctaaatggc tattttgtgt 360 
gtcctcttcc ctctcctcct cctcctgtcc ccatcctcct cctgtcccca tcctcctcca 420 
atcttcactt tctgctcatg tctttttatt ttaattcact gagaaaatgg aagtaatagt 480 
agaaaactta cacctaccta ccagccccat ccccccggta gcgggagcgg agagcggacc 540 
ccagagagcc ctgagcagcc ccaccgccgc cgccggccta gttaccatca caccccggga 600 
ggagccgcag ctgccgcagc cggccccagt caccatcacc gcaaccatga gcagcgaggc 660 
cgagacccag cagccgcccg ccgccccccc cgccgccccc gccctcagcg ccgccgacac 720 
caagcccggc actacgggca gcggcgcagg gagcggtggc ccgggcggcc tcacatcggc 780 
ggcgcctgcc ggcggggaca agaaggtcat cgcaacgaag gttttgggaa cagtaaaatg 840 
gttcaatgta aggaacggat atggtttcat caacaggaat gacaccaagg aagatgtatt 900 
tgtacaccag actgccataa agaagaataa ccccaggaag taccttcgca gtgtaggaga 960 
tggagagact gtggagtttg atgttgttga aggagaaaag ggtgcggagg cagcaaatgt 1020 
tacaggtcct ggtggtgttc cagttcaagg cagtaaatat gcagcagacc gtaaccatta 1080 
tagacgctat ccacgtcgta ggggtcctcc acgcaattac cagcaaaatt accagaatag 1140 
tgagagtggg gaaaagaacg agggatcgga gagtgctccc gaaggccagg cccaacaacg 1200 
ccggccctac cgcaggcgaa ggttcccacc ttactacatg cggagaccct atgggcgtcg 1260 
accacagtat tccaaccctc ctgtgcaggg agaagtgatg gagggtgctg acaaccaggg 1320 
tgcaggagaa caaggtagac cagtgaggca gaatatgtat cggggatata gaccacgatt 1380 
ccgcaggggc cctcctcgcc aaagacagcc tagagaggac ggcaatgaag aagataaaga 1440 
aaatcaagga gatgagaccc aaggtcagca gccacctcaa cgtcggtacc gccgcaactt 1500 
caattaccga cgcagacgcc cagaaaaccc taaaccacaa gatgggcaag agacaaaagc 1560 
agccgatcca ccagctgaga attcgtccgc tcccgaggct gagcagggcg gggctgagta 1620 
aatgccggct taccatctct accatcatcc ggtttagtca tccaacaaga agaaatatga 1680 
aattccagca ataagaaatg aacaaaagat tggagctgaa gacctaaagt gcttgctttt 1740 
tgcccgttga ccagataaat agaactatct gcattatcta tgcagcatgg ggtttttatt 1800 
atttttacct aaagacgtct ctttttggta ataacaaacg tgttttttaa aaaagcctgg 1860 
tttttctcaa tacgccttta aaggttttta aattgtttca tatctggtca agttgagatt 1920 
tttaagaact tcatttttaa tttgtaataa aagtttacaa cttgattttt tcaaaaaa 1978 

<210> 45 

<211> 2859 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 3363382CB1 
<400> 45 

ccggcgatga gcggcgggag cgcgtgagcg agccgggcgg gcgaggctgg ggggccccgg 60 
agcgcgagcc ggagggcggg ggcgcgagcg cggcgggcgg tggcggcgag gcggcggcct 120 
agaagatggc ggacggcgac agcggcagcg agcgcggcgg cggcggtggg ccgtgcgggt 180 
tccagcccgc gtcccgcggc ggcggcgagc aagagacgca ggagctggcc tcgaagcggc 240 
tggacatcca gaacaagcgc ttctacttag atgtgaagca gaacgccaag ggccgcttcc 300 
tcaagatcgc cgaggtgggc gcgggcggtt ccaagagccg cctcacgctg tccatggcgg 360 
tggccgccga gttccgcgac tcgctgggcg acttcataga acactacgcg cagctgggcc 420 
ctagcagccc cgagcagctg gcggcgggcg ccgaggaggg cggcgggccg cggcgcgcgc 480 
tcaagagcga attcttggtg cgtgagaacc gcaagtacta cctggacctc aaggagaacc 540 
agcgcggccg cttcctgcgc atccgccaaa cggtcaaccg cggcggtggc ggcttcggcg 600 
cgggccccgg gccgggcggc ttgcagagcg gccagaccat cgcgctgcct gcgcagggcc 660 
tcatcgagtt ccgcgacgcg ctggcgaagc tcatagacga ctacggaggc gaggacgacg 720 
agctggcagg cggcccggga ggcggcgccg ggggcccagg gggcggcctg tatggagagc 780 
tcccggaggg cacctccatc accgtggact ccaagcgctt cttcttcgat gtgggctgca 840 
acaaatacgg ggtgtttctg cgagtgagcg aggtgaagcc gtcctaccgc aatgccatca 900 
ccgtaccctt caaagcctgg ggcaagttcg gaggcgcctt ttgccggtat gcggatgaga 960 
tgaaagaaat ccaggaacga cagagggata agctttatga gcgacgtggt gggggcagcg 1020 
gcggcggcga agagtcagag ggtgaggagg tggatgagga ttgaaacggg cagcttcccc 1080 
tacaggcctc cacccaacca ccatcccctt ggctagagaa ttccctttcc tgctcatccc 1140 
cagtgagcta gtggagaggg agcaagggag gccgcagagg gaaaaaacaa aacgtaacac 1200 
agttaagaga aataatcgta agagaacagt gacggacaac ttgagaaaag cagtcaagtt 1260 
ccaaggaact gacagcaacc tgcaaagagg aaaacagcat ctcctcacct gcgtaaaatt 1320 
gtctcagctt ctgttgtttc tcaactgagg ttcgtaaacc catcaggata atccctggag 1380 
ggaatagatc cttgcacatc cagggcaaga aacatgtcca agttacccag accattgata 1440 
acagttgcat ttaggttgca cctgggtaat ctggcataaa agatctctct aggcctcact 1500 
gttgcggtgt ctatcccttc acctccattg aaatcagcat tttggatcta ggtcttcatg 1560 
gaatccttga gaagagaggc ctttacaatt acccagttct gagggttcag gttcacgaaa 1620 
agaaatgcaa cttgggataa tcatgaacag gttaaagata agatttcaag aagccatcta 1680 
agaatacaga accaaattgg atccattttt ttaaaaaaat ggttttgcat ggaacctgga 1740 
ccaaggcaaa tgtcttttct tcgcagaatt gttttccagg atgccagtgg attcagatag 1800 
caatgcttgg agtagaatcc gttactaaaa tagtttcaaa gttgacaaaa aattttcaaa 1860 
gataaaagca gttttacatt gggggttgct gaggtaggca caagaaaaag tcaggcataa 1920 
agcacaaggc agactgtttg agtggattgg ttgctgctca ctaaagttgt tcccctgatc 1980 
tctaaatatg gaggtcatta ccaagaaatg ctttggtatg aatgagagcc agatctccac 2040 
tgtgtgagcc agtgaattat ggctaattcg gctgttacag ccactggttg gctggatttt 2100 
aaaccataaa acttgaagat tacctacaaa agtaacagtg tggctataag cctgagcttt 2160 
aatggatata catcctcaca gaaaagttgg aaataaccaa aactgaagtc ttaatttacc 2220 
ttcagtttaa tctgtggatt tgttcaaata ctaaagatcc tcaggtccag aattccagca 2280 
tcatttattc ttttaaaatt tttaagaact tgatccattg tatcagtacc tcacaatcag 2340 
agttggcaaa tgatggatga gtgattcaag cagtgcaccc ggtggaagct gaaatccatc 2400 
tgtgaatgga actgaagtga acgtgaatat gctgactata tcctggaagc atttttatac 2460 
catcttgaaa tttcaacaaa ctggcttttg ccagttaatc cagctgtctt tcaagaataa 2520 
aagttggggt tttcaaggat cgcctcttct atattttaaa tggattttca gtagaaatga 2580 
tttttactaa tcaagttaat cccaccccat caaaaggtat tcctagaaat gtcatagacc 2640 
taggtaactt tgaattgaat gggagctaac gttctttcca aagttttcag gtattctttg 2700 
tgtgacacct tctcaaccag gaggcaagta accccgcctc cacaatctta gtattttttt 2760 
taaactgcat gcctgcccct tatttgagct gcctttttaa tttattgcat atccttttta 2820 
ttatcttatt ttggtattat tcaatctata caatctttt 2859 

<210> 46 

<211> 1772 

<212> DNA 

<213> Homo sapiens 

<220> 
<221> misc_feature 

<223> Incyte ID No: 7491148CB1 



<400> 46 

atgaaggacc acgacgccat caagctcttc gtggggcaga tcccgcgggg cttggacgag 60 
caggacctca agccgctgtt cgaggagttc ggccgcatct acgagctgac ggtgctgaag 120 
gaccggctca ccggcctcca caaaggctgt gccttcctca cctactgcgc ccgggactct 180 
gctctcaagg cccagagtgc actgcacgag cagaagaccc tgccagggtt tcatatcctt 240 
aacaacaata acaacaacaa aaaccggcca gaggaccgaa agctgtttgt ggggatgctg 300 
ggcaagcagc agggtgagga ggacgtcaga cgcctgttcc agccctttgg ccacatcgag 360 
gagtgcacgg tcctgcggag tcctgacggc accagtaaag gctgtgcctt tgtgaagttc 420 
gggagtcaag gggaagctca ggcggccatc cggggtctgc acggcagccg gaccatggcg 480 
ggcgcctcgt ccagcctcgt ggtcaagctg gcggacaccg accgggagcg cgcgctgcgg 540 
cggatgcagc agatggccgg ccacctgggc gccttccacc ccgcgccact gccgctaggg 600 
gcctgcggcg cctacaccac ggcgatcctg cagcaccagg cggccctgct ggcggcggca 660 
cagggcccag gcctaggccc ggtggcggca gtggcggccc agatgcaaca cgtggcggcc 720 
tttagcctgg tagctgcgcc tctgttgccc gcggcagcag ccaactcccc gcctggcagc 780 
ggccctggca ccctcccagg tcttccggcg cccatcgggg tcaatggagt tcggccctct 840 
gacaccccca gatccaatgg ccagccgggc tccgacacgc tctacaataa cgggctctcc 900 
ccttatccag cccagagccc cggcgtggct gaccccctgc agcaggccta cgctgggatg 960 
caccactacg cagcagccta tccgtcggcc tatgccccag tgagcacagc ttttccccag 1020 
cagccttcag ccctgcccca gcagcagaga gaaggccccg aaggctgtaa cctcttcatc 1080 
tatcacctgc ctcaggagtt tggtgatgcg gaactcatac agacattcct gccctttgga 1140 
gccgttgtct ctgctaaagt ctttgtggat cgagccacca accagagcaa gtgttttggg 1200 
tttgttagtt ttgacaatcc aactagtgcc cagactgcta ttcaggcgat gaatggcttt 1260 
caaattggca tgaagaggct caaggtccag ctaaagcggc ccaaggatgc caaccggcct 1320 
tactgacctg ctttcactga ccagccacag aaagaaacag aagagtgaga agaaaggaga 1380 
ggaaaagcac agaaacgctt gagcagccct tcccgaagga gcagctgcgg acggaggtgg 1440 
atcggaccca aggctggtgc ctggggctaa ggccactcta aggattgttt ttatcaagtg 1500 
ggttgttctg tgcctgcagc atagagcgca ggctggcaga gcaaataggg ctggtgagga 1560 
gtgactgtcc aggggaacca gcagagggcg ttgggggtgc caagggcttc tccgcaaggg 1620 
aagcccagat ttacttcttt caaaatcata tcattcctta gagtttaggg accaaaggac 1680 
tattgctttt ttaagaatat atatatctat ataaattaaa acaaactttc cagcacactg 1740 
cgccgttact actgatccgg ctcgtccctc 



<210> 47 

<211> 3112 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 8126343CB1 



<400> 47 

atggcgaccg accttcccat catggcgcgt ggccccgccc gctccgccgc gcctgcggga 60 
gggagcagtt ccgggtgcgg tgcgcgccag gggcgggcgg ggggcggcgt cctggccatg 120 
gccggcctgt cggacctgga actgcggcgg gagctgcagg ccctgggctt ccagccagga 180 
cccatcaccg acaccacccg ggatgtctac cgcaacaagc tgcgccgcct gcggggcgag 240 
gcccggctgc gcgacgagga gcggctgcgg gaggaggccc ggccgcgggg cgaggagcgg 300 
ttacgggaag aggcccggtt acgcgaggat gcgccgctgc gcgcccggcc cgccgcggcc 360 
tctccgcggg cggagccctg gctctcccag ccggcctcgg gctcggccta cgcgacccct 420 
ggggcctacg gtgatatccg gccctccgcg gcttcctggg tagggagccg cggcctcgcc 480 
tatcctgccc gcccggcgca actcaggcgc cgcgcctcgg tccggggcag ctccgaggag 540 
gacgaggacg cccggacgcc cgacagggcc acgcagggcc cgggtctcgc ggcccgccgc 600 
tggtgggcag cgtctcccgc cccggcgcgg ctgccttcct ccctcctcgg tcccgacccg 660 
cgcccgggcc tgcgggcgac tcgagcgggc cctgctggcg cggcgagggc ccggcctgag 720 
gtggggcgcc ggctggagcg ctggctctct cggcttctgc tctgggccag cctagggcta 780 
ctgctcgtct tcctgggcat cctttgggtg aagatgggca agccctcagc gccgcaggag 840 
gcggaggaca acatgaagtt attgccagtg gactgtgaga gaaaaacaga tgagttctgt 900 
caggccaagc agaaggcagc cttgctggag ctgctgcatg aactctacaa tttcctggcc 960 
atccaagctg gtaattttga gtgtggaaat ccagagaatc taaaaagcaa atgcattcct 1020 
gttatggaag cccaagaata tatagccaat gtgaccagca gctcctccgc caagtttgaa 1080 
gccgcactga cctggatact gagcagtaac aaggacgtgg gcatctggtt gaaaggagaa 1140 
gaccagtctg aattggtgac gactgtggac aaggtggtct gcctggaatc tgcccacccc 1200 
cgcatgggtg ttggctgccg cctgagccgg gccttgctca ctgctgtcac caacgtgctc 1260 
atcttcttct ggtgcttggc ttttttgtgg gggctcctaa ttctcctaaa atatcggtgg 1320 
cgaaagttag aagaggagga acaagccatg tatgagatgg tgaagaagat tatagacgtg 1380 
gtccaggacc attacgtgga ctgggagcag gacatggagc gctatccata tgtaggcatc 1440 
ctgcacgtgc gcgacagctt gatccctcca cagagccggt gagtccctgg ctggggcgct 1500 
gggtgtgagg taagatgggg tccgacccca ctgagctccc cagaaggaaa cccagagcct 1560 
tgcatcagat gtcagctgag ggcagtgcag accccagggc caagagtggc tctgctccca 1620 
ctcctgctgc tccccagcat gctcggctct ttatggcagg atgagctaga aaagacatat 1680 
gaaagcacaa tcccacacct ctgtagagag aacactggca gtggtgatac cagttaggag 1740 
gccttgggct ccaagtaata gcagctccct cccccagggc acaagtagtg aaggatctcc 1800 
tcagtctctg taagtggaag ttcagagctg catgggctcc aggcacagaa cagtcaggtc 1860 
tccggttcca ttgctcttgt gattctcgtg gtcccacact cctccatgga tgttatcaac 1920 
agactggggt ccctcatggt tgcagatggt ggccaacagc agctggggca ctgtctgcct 1980 
tgtcacagcc agcaggggaa aagtaggcct attgttgtct tataagcaag agaacatgtt 2040 
cccagtagcc ttcagtgaca cactccttga ggctttattg accagaatct cttacatgtc 2100 
tgtttatgaa tcagtcactg gcaaggagga ggaaattgtt tttcaaactt tttaaaccat 2160 
gacctgcagt caaaaacacc tgttacctgc aacatcatgg tccagttccc atacacgcat 2220 
gcgtacactc atgcatatac atgtaattga aagcaaagtt tcacaacatg aatggggtca 2280 
gcttccaggg cacatagctt catgggaagg ggtggcaggg ccactgtcat gtgcagctgc 2340 
agggcacctt ccacatgggc tgtgactgct ggagttgtac actggggtcc tgacagaacc 2400 
agggtggggg agggcagagt cccaggagac agccagcagg gcccccccca gtgttggctc 2460 
tggttgtggt ccctccctag cagacacttg ctgtggtttc tggccagatg ctgtcatctg 2520 
ctgggtgatg ctgagagctc acccttgctg agtgctcgct gggagccggg cactacttgg 2580 
agctctccac agatgcaaac tcactcactt ggtcctcacc atggttcacg cacacaacag 2640 
aagtggcccc attttaccta taagggaact gaggcacagg gaagtagagg aatgtgctca 2700 
agagtcccat ggctagtggc atttccgtgc tcccgccacc aggctctact gtcaacatca 2760 
tccctatctg caacccaaga ggaatcagaa aaatagagcc gttagcagtc cactttaagt 2820 
caggcacagt ggctcacacc tgtaatccca gcattttggg aggccaaagt gggaggattg 2880 
cttgagccca tgagttcaag accagcctgg acaacatagc aaaaccccat ctctacaaaa 2940 
aataaatatt agtgggcatg gtgacacatg cctgtagtcc cagctactca ggaggctgag 3000 
gcaggaggat cactcgagcc caggaggtcg aggcgacagt gagctataac tatgccactg 3060 
cacatcagcc tggacaacag agcaagaccc tatctcttaa aaaaaaaaaa aa 3112 

<210> 48 

<211> 1939 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7044055CB1 

<400> 48 

gtggtaagat ggcggctgtg agtctgcggc tcggcgactt ggtgtggggg aaactcggcc 60 
gatatcctcc ttggccagga aagattgtta atccaccaaa ggacttgaag aaacctcgcg 120 
gaaagaaatg cttctttgtg aaattttttg gaacagaaga tcatgcctgg atcaaagtgg 180 
aacagctgaa gccatatcat gctcataaag aggaaatgat aaaaattaac aagggtaaac 240 
gattccagca agcggtagat gctgtcgaag agttcctcag gagagccaaa gggaaagacc 300 
agacgtcatc ccacaattct tctgatgaca agaatcgacg taattccagt gaggagagaa 360 
gtaggccaaa ctcaggtgat gagaagcgca aacttagcct gtctgaaggg aaggtgaaga 420 
agaacatggg agaaggaaag aagagggtgt cttcaggctc ttcagagaga ggctccaaat 480 
cccctctgaa aagagcccaa gagcaaagtc cccggaagcg gggtcggccc ccaaaggatg 540 
agaaggatct caccatcccg gagtctagta ccgtgaaggg gatgatggcc ggaccgatgg 600 
ccgcgtttaa atggcagcca accgcaagcg agcctgttaa agatgcagat cctcatttcc 660 
atcatttcct gctaagccaa acagagaagc cagctgtctg ttaccaggca atcacgaaga 720 
agttgaaaat atgtgaagag gaaactggct ccacctccat ccaggcagct gacagcacag 780 
ccgtgaatgg cagcatcaca cccacagaca aaaagatagg atttttgggc cttggtctca 840 
tgggaagtgg aatcgtctcc aacttgctaa aaatgggtca cacagtgact gtctggaacc 900 
gcactgcaga gaaatgtgat ttgttcatcc aggagggggc ccgtctggga agaacccccg 960 
ctgaagtcgt ctcaacctgc gacatcactt tcgcctgcgt gtcggatccc aaggcggcca 1020 
aggacctggt gctgggcccc agtggtgtgc tgcaagggat ccgccctggg aagtgctacg 1080 
tggacatgtc aacagtggac gctgacaccg tcactgagct ggcccaggtg attgtgtcca 1140 
ggggggggcg ctttctggaa gcccccgtct cagggaatca gcagctgtct aatgacggga 1200 
tgttggtgat cttagcggct ggagacaggg gcttatatga ggactgcagc agctgcttcc 1260 
aggcgatggg gaagacctcc ttcttcctag gtgaagtggg caatgcagcc aagatgatgc 1320 
tgatcgtgaa catggtccaa gggagcttca tggccactat tgccgagggg ctgaccctgg 1380 
cccaggtgac aggccagtcc cagcagacac tcttggacat cctcaatcag ggacagttgg 1440 
ccagcatctt cctggaccag aagtgccaaa atatcctgca aggaaacttt aagcctgatt 1500 
tctacctgaa atacattcag aaggatctcc gcttagccat tgcgctgggt gatgcggtca 1560 
accatccgac tcccatggca gctgcagcaa atgaggtgta caaaagagcc aaggcgctgg 1620 
accagtctga caacgatatg tccgccgtgt accgagccta catacactaa gctgtcgaca 1680 
ccccgccctc acccctccaa tcccccctct gaccccctct tcctcacatg gggtcggggg 1740 
cctgggagtt cattctggac cagcccacct atctccattt ccttttatac agactttgag 1800 
acttgccatc agcacagcac acagcagcac ccttcccctg aggccggtgg ggaggggaca 1860 
agtgtcagca ggattggcgt gtgggaaagc tcttgagctg ggcactggcc ccccggacga 1920 
ggtggctgtg tgttcacac 1939 

<210> 49 
<211> 5902 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7493424CB1 

<400> 49 

gtggtttgag gctctgcgaa gttacagagg ctcggaggcg cttgcaaaaa tgtggctgtc 60 
agagagggaa gtgcacatga atttgaagac agcattagaa agcgttccgc ccacagtttc 120 
tctgttctca acggctcagt tttaacagga taaattttaa gttaagtccc atatgaaggc 180 
tcaaaagagc ggtaaagaac aacagcttga cattatgaac aagcagtacc aacaacttga 240 
aagtcgtttg gatgagatac tttctagaat tgctaaggaa acggaagaga ttaaggacct 300 
tgaagaacag cttactgaag gccagatagc agcaaatgaa gccctgaaga aggatttaga 360 
aggtgttatc agtgggttgc aagaatacct ggggaccatt aaaggccagg caactcaggc 420 
ccagaatgag tgcaggaagc tgcgggatga gaaagagaca ttgttgcaga gattgacaga 480 
agtcgagcag gagagagacc agctggaaat agttgccatg gatgcagaaa atatgaggaa 540 
ggagcttgca gagctagaaa gtgccctcca agagcagcat gaggtgaatg catctttgca 600 
gcagacccag ggagatctca gtgcctatga agctgagcta gaggctcggc taaacctaag 660 
ggatgctgaa gccaaccagc tcaaggaaga gttggaaaaa gtaacaagac ttacccagtt 720 
agaacaatca gcccttcaag cagaacttga gaaggaaagg caagccctca agaatgccct 780 
tggaaaagcc cagttctcag aagaaaagga gcaagagaac agtgagctcc atgcaaaact 840 
taaacacttg caggatgaca ataatctgtt aaaacagcaa cttaaagatt tccagaatca 900 
ccttaaccat gtggttgatg gtttggttcg tccagaagaa gtggcagctc gtgtggatga 960 
gctaagaaga aaactgaaat taggaactgg ggaaatgaac atccatagtc cttcagatgt 1020 
cttagggaaa agtcttgctg atttacagaa acaattcagt gaaattcttg cacgctccaa 1080 
gtgggaaaga gatgaagcac aagttagaga gagaaaactc caagaagaaa tggctctgca 1140 
gcaagagaaa ctggcaactg gacaagaaga gttcaggcag gcctgtgaga gagccctgga 1200 
agcaagaatg aattttgata agaggcaaca tgaagcaaga atccagcaaa tggagaatga 1260 
aattcactat ttgcaagaaa atctaaaaag tatggaggaa atccaaggcc ttacagatct 1320 
ccaacttcag gaagctgatg aagagaagga gagaattctg gcccaactcc gagagttaga 1380 
gaaaaagaag aaacttgaag atgccaaatc tcaggagcaa gtttttggtt tagataaaga 1440 
actgaagaaa ctaaagaaag ccgtggccac ctctgataag ctagccacag ctgagctcac 1500 
cattgccaaa gaccagctga agtcccttca tggaactgtt atgaaaatta accaggagcg 1560 
agcagaggag ttgcaggaag cagagaggtt cagcagaaag gcagcacaag cagccagaga 1620 
tctcacccga gcagaagctg agatcgaact cctgcagaat ctcctcaggc agaaggggga 1680 
gcagtttcga cttgagatgg agaaaacagg tgtaggtact ggagcaaact cacaggtcct 1740 
agaaattgag aaactgaatg agacaatgga acgacaaagg acagagattg caaggctgca 1800 
gaatgtacta gacctcactg gaagtgacaa caaaggaggc tttgaaaatg ttttagaaga 1860 
aattgctgaa cttcgacgtg aagtttctta tcagaatgat tacataagca gcatggcaga 1920 
tcctttcaaa agacgaggct attggtactt tatgccacca ccaccatcat caaaagtttc 1980 
cagccatagt tcccaggcca ccaaggactc tggtgttggc cttaagtact cagcctcaac 2040 
tcctgttaga aaaccacgcc ctgggcagca ggatgggaag gaaggcagtc aacctccccc 2100 
tgcctcagga tactgggttt attctcccat caggagtggg ttacataaac tgtttccaag 2160 
tagagatgca gacagtggag gagatagtca ggaagagagt gagctggatg accaagaaga 2220 
acccccattt gtgcctcctc ctggatacat gatgtatact gtgcttcctg atggttctcc 2280 
tgtaccccag ggcatggccc tgtatgcacc acctcctccc ttgccaaaca atagccgacc 2340 
tctcacccct ggcactgttg tttatggccc acctcctgct ggggccccca tggtgtatgg 2400 
gcctccaccc cccaacttct ccatcccctt catccctatg ggtgtgctgc attgcaacgt 2460 
ccctgaacac cataacttag agaatgaagt ttctagatta gaagacataa tgcagcattt 2520 
aaaatcaaag aagcgggaag aaaggtggat gagagcatcc aagcggcagt cggagaaaga 2580 
aatggaagaa ctgcatcata atattgatga tcttttgcaa gagaagaaaa gcttagagtg 2640 
tgaagtagaa gaattacata gaactgtcca gaaacgtcaa cagcaaaagg acttcattga 2700 
tggaaatgta gagagtctta tgactgaact agaaatagaa aaatcactca aacatcatga 2760 
agatattgta gatgaaattg agtgcattga gaagactctt ctgaaacgtc gctcagagct 2820 
cagggaagct gaccgactcc tggcagaggc tgagagtgaa ctttcatgca ctaaagaaaa 2880 
gacaaaaaat gctgttgaaa agttcactga tgccaagaga agtttattgc aaactgagtc 2940 
agatgctgag gaattagaaa ggagagctca ggaaactgct gttaacctcg tcaaagctga 3000 
tcagcagcta agatcgctcc aggctgatgc aaaggatttg gagcagcaca aaatcaagca 3060 
agaagaaatc ttgaaagaaa taaacaaaat tgtagcagca aaagactcag acttccaatg 3120 
tttaagcaag aagaaggaaa aactgacaga agagcttcag aaactacaga aagacataga 3180 
gatggcagaa cgcaatgagg atcaccacct gcaggtcctt aaagaatctg aggtgcttct 3240 
tcaggccaaa agagccgagc tggaaaagct gaaaagccag gtgacaagtc agcagcagga 3300 
gatggctgtc ttggacaggc agttagggca taaaaaggag gagctgcatc tactccaagg 3360 
aagcatggtc caggcaaaag ctgacctcca ggaagctctg agactgggag agactgaagt 3420 
aactgagaag tgcaatcaca ttagggaagt aaaatctctt ctggaagaac tgagttttca 3480 
gaaaggagaa ctaaatgttc agattagtga aagaaaaact caacttacac ttataaagca 3540 
ggaaattgaa aaagaggaag aaaatcttca ggttgtttta aggcagatgt ctaaacataa 3600 
aaccgaacta aagaatattc tggacatgtt gcaacttgaa aaccatgagc tacaaggttt 3660 
gaagctacaa catgaccaaa gggtatctga attagagaag actcaggtgg cagtgctaga 3720 
ggagaaactg gagttagaga atttgcagca gatatcccag cagcagaaag gggaaataga 3780 
gtggcagaag cagctccttg agagggataa acgagaaata gaacgaatga ctgctgagtc 3840 
ccgagcttta caatcgtgtg ttgagtgttt gagcaaagaa aaggaagatc tccaagagaa 3900 
atgtgacatt tgggaaaaaa agttggcaca aaccaaaagg gttttagcag cagcagaaga 3960 
aaatagcaaa atggagcaat caaacttaga aaagttggaa ttgaatgtca gaaaactgca 4020 
gcaggaacta gaccaactaa acagagacaa gttgtcactg cataacgaca tttcagcaat 4080 
gcaacagcag ctccaagaaa aacgagaagc agtaaactca ctgcaggagg aactagctaa 4140 
tgtccaagac catttgaacc tagcaaaaca ggacctgctt cacaccacca agcatcagga 4200 
tgtgttgctc agtgagcaga cccgactcca gaaggacatc agtgaatggg caaataggtt 4260 
tgaagactgt cagaaagaag aggagacaaa acaacaacaa cttcaagtgc ttcagaatga 4320 
gattgaagaa aacaagctca aactagtcca acaagaaatg atgtttcaga gactccagaa 4380 
agagagagaa agtgaagaaa gcaaattaga aaccagtaaa gtgacactga aggagcaaca 4440 
gcaccagctg gaaaaggaat taacagacca gaaaagcaaa ctggaccaag tgctctcaaa 4500 
ggtgctggca gctgaagagc gtgttaggac tctgcaggaa gaggagaggt ggtgtgagag 4560 
cctggagaag acactctccc aaactaaacg gcagctttca gaaagggagc agcaattggt 4620 
ggagaaatca ggtgagctgt tggccctcca gaaagaggca gattctatga gggcagactt 4680 
cagccttctg cggaaccagt tcttgacaga aagaaagaaa gctgagaagc aggtggccag 4740 
cctgaaggaa gcacttaaga tccagcggag ccagctggag aaaaaccttc ttatggcaaa 4800 
ccaaaaagat ttggagagaa gacaaatgga aatcagtgat gcaatgagga cacttaaatc 4860 
tgaggtgaag gatgaaatca gaaccagctt gaagaatctt aatcagtttc ttccagaact 4920 
accagcagat ctagaagcta ttttggaaag aaacgaaaac ctagaaggag aattggaaag 4980 
cttgaaagag aaccttccat ttaccatgaa tgagggacct tttgaagaaa aactgaactt 5040 
ttcccaagtt cacataatgg atgaacactg gcgtggagaa gcactccggg agaaactgcg 5100 
tcaccgggaa gaccgactca aggcccaact ccgacactgt atgtccaagc aagcagaagt 5160 
attaattaaa ggaaagcggc agacagaggg cactttacac agtttgagga gacaagtaga 5220 
tgctttaggg gaattggtca ccagcacctc tgcagattca gcgtcatcac ccagtctgtc 5280 
tcagctggag tcttccctca cagaggactc tcaacttgga caaaatcagg aaaagaatgc 5340 
ctcagccaga tgaggaatac tgtcttgtgt caaatatatt caaggaaaac actccaatac 5400 
tcactgactt cataattggc atgtcccatg gtttttttta atcaagatgc agtgaactga 5460 
gaattcggaa actccactgg tagttactgt ggcctgtacc atttaatgcc aaggttttat 5520 
aaatcactgg tgctagtaca tttgggacca acgtgctatg gggaattaac caacatgggg 5580 
tgaagctttt tttaaacctc gtacaggttt taaaacaaag cgggagtttt acggggtatt 5640 
ttaaacgtca aaaggggagc cttttttcaa tagaaggtgt gttgagggca gcttatatag 5700 
aggggagcaa catgaggatg tgggtgacaa acttcttcct gaagaagaag cagaccacaa 5760 
actttgggcg cgcagagaaa gggagaacaa caatatagga ccagggcgag gtagtatata 5820 
ataatgttat catcacacat cgacaaagag aggaataaga tatgatagat ataaaaataa 5880 
tgataaatga cggggggaga aa 5902 

<210> 50 
<211> 2687 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 1482140CB1 



<400> 50 

ggagcggcga ctggcgagcc atggcgctgg ggctgcagcg cgcaaggccg gccctttcct 60 
gtggagtcat ctcaccgccg tgcgcaccca ctcgtaactc gcacccgggt cctggctgca 120 
ccgcatcccc tcctgcaccc cctggatggc ccttcagcca acgggggcct gggcgatggt 180 
cgaccacgga gctgcgcaag gaaaagtccc gggatgcggc ccgcagccgg cgcagccagg 240 
agaccgaggt gctgtaccag ctggctcaca cgctgccctt cgcccgcggc gtcagcgccc 300 
acctggacaa ggcctctatc atgcgcctca ccatcagcta cctgcgcatg caccgcctct 360 
gcgccgcagg ggagtggaac caggtgggag caggggagaa ccactggatg ctgctactga 420 
aggccctgga gggcttcgtc atggtgctca ccgccgaggg agacatggct tacctgtcgg 480 
agaatgtcag caaacacctg ggcctcagtc agctggagct cattggacac agcatctttg 540 
atttcatcca cccctgtgac caagaggagc ttcaggacgc cctgaccccc cagcagaccc 600 
tgtccaggag gaaggtggag gcccccacgg agcggtgctt ctccttgcgc atgaagagta 660 
cgctcaccag ccgcgggcgc accctcaacc tcaaggcggc cacctggaag gtgctgaact 720 
gctctggaca tatgagggcc tacaagccac ctgcgcagac ttctccagct gggagccctg 780 
actcagagcc cccgctgcag tgcctggtgc tcatctgcga agccatcccc cacccaggca 840 
gcctggagcc cccactgggc cgaggggcct tcctcagccg ccacagcctg gacatgaagt 900 
tcacctactg tgacgacagg attgcagaag tggctggcta tagtcccgat gacctgatcg 960 
gctgttccgc ctacgagtac atccacgcgc tggactccga cgcggtcagc aagagcatcc 1020 
acaccttgct gagcaagggc caggcagtaa cagggcagta tcgcttcctg gcccggagtg 1080 
gtggctacct gtggacccag acccaggcca cagtggtgtc agggggacgg ggcccccagt 1140 
cggagagtat cgtctgtgtc cattttttaa tcagccaggt ggaagagacc ggagtggtgc 1200 
tgtccctgga gcaaacggag caacactctc gcagacccat tcagcggggc gccccctctc 1260 
agaaggacac ccctaaccct ggggacagcc ttgacacccc tggcccccgg atccttgcct 1320 
tcctgcaccc gccttccctg agcgaggctg ccctggccgc tgacccccgc cgtttctgca 1380 
gccctgacct ccgtcgcctc ctgggaccca tcctggatgg ggcttcagta gcagccactc 1440 
ccagcacccc gctggccaca cggcaccccc aaagtcctct ttcggctgat ctcccagatg 1500 
aactacctgt gggcaccgag aatgtgcaca gactcttcac ctccgggaaa gacactgagg 1560 
cagtggagac agatttagat atagctcagg atgctgatgc tctggatttg gagatgctgg 1620 
ccccctacat ctccatggat gatgacttcc agctcaacgc cagcgagcag ctacccaggg 1680 
cctaccacag acctctgggg gctgtccccc ggccccgtgc tcggagcttc catggcctgt 1740 
cacctccagc ccttgagccc tccctgctac cccgctgggg gagtgacccc cggctgagct 1800 
gctccagccc ttccagaggg gacccctcag catcctctcc catggctggg gctcggaaga 1860 
ggaccctggc ccagagctca gaggacgagg acgagggagt ggagctgctg ggagtgagac 1920 
ctcccaaaag gtcccccagc ccagaacacg aaaactttct gctctttcct ctcagcctga 1980 
gtttccttct gacaggagga ccagccccag ggagcctgca ggaccccact gaacttaccc 2040 
aattccttct ttcagtctta agttttccca ttctagaccc ctaccctcta ggctgtgctg 2100 
ctcctggact tcatgcctct ccattctcat tgcctacaat ctctgtgccc cagaaccccc 2160 
tccactcccc accccagccc tccagacatg cacttacctt gactttaccc cacatgtttg 2220 
gggcacctgg ggctccctca ccccttgggt ggtttgcaat ctgaagactt ctccagccac 2280 
acaggcacat gcacaggcac ggtgctgtct gcatattgcc aggtggggag agaagccagg 2340 
acccctcagc tgtctgccac catctatgtg cctcccttac cccccagctt tctttctaca 2400 
gatggtgcta ctcttggtct cccacaggaa aaggcctccc cccttcttag ccccatttac 2460 
cccgtttgtg gaaggcactg ctcgctctgt tttgtcagag agtggcctat ccagattggt 2520 
gctatggggg ggtctgaccc ctccctcctc cctctggagg tgatgtgggc cctcaatgga 2580 
gggaattgtg ctgggctagg gaaaggggag ggactagact ggccacactg gctctgaaac 2640 
tcaccaacct ctatacacca taaagacctc accttggtag gcaccag 2687 

<210> 51 

<211> 8280 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 394992CB1 

<400> 51 

atggtctccg aggaggagga ggaggaggac ggcgacgccg aggagaccca ggattctgag 60 
gacgacgagg aggatgagat ggaagaggac gacgatgact ccgattatcc ggaggagatg 120 
gaagacgacg acgacgacgc cagttactgc acggaaagca gcttcaggag ccatagtacc 180 
tacagcagca ctccaggtag gcgaaaacca agagtacatc ggcctcgttc tcctatattg 240 
gaagaaaaag acatcccgcc ccttgaattt cccaagtcct ctgaggattt aatggtgcct 300 
aatgagcata taatgaatgt cattgccatt tacgaggtac tgcggaactt tggcactgtt 360 
ttgagattat ctccttttcg ctttgaggac ttttgtgcag ctctggtgag ccaagagcag 420 
tgcacactca tggcagagat gcatgttgtg cttttgaaag cagttctgcg tgaagaagac 480 
acttccaata ctacctttgg acctgctgat ctgaaagata gcgttaattc cacactgtat 540 
ttcatagatg ggatgacgtg gccagaggtg ctgcgggtgt actgtgagag tgataaggag 600 
taccatcacg ttcttcctta ccaagaggca gaggactacc catatggacc agtagagaac 660 
aagatcaaag ttctacagtt tctagtcgat cagtttctta caacaaatat tgctcgagag 720 
gaattgatgt ctgaaggggt gatacagtat gatgaccatt gtagggtttg tcacaaactt 780 
ggggatttgc tttgctgtga gacatgttca gcagtatacc atttggaatg tgtgaagcca 840 
cctcttgagg aggtgccaga ggacgagtgg cagtgtgaag tctgtgtagc acacaaggtg 900 
cctggtgtga ctgactgtgt tgctgaaatc caaaaaaata aaccatatat tcgacatgaa 960 
cctattggat atgatagaag tcggaggaaa tactggttct tgaaccgaag actcataata 1020 
gaagaagata cagaaaatga aaatgaaaag aaaatttggt attacagcac aaaggtccaa 1080 
cttgcagaat taattgactg tctagacaaa gattattggg aagcagaact ctgcaaaatt 1140 
ctagaagaaa tgcgtgaaga aatccaccga cacatggaca taactgaaga cctgaccaat 1200 
aaggctcggg gcagtaacaa atcctttctg gcggcagcta atgaagaaat tttggaatcc 1260 
ataagagcca aaaagggaga cattgataat gttaaaagcc cagaagaaac agaaaaagac 1320 
aagaatgaga ctgagaatga ctctaaagat gctgagaaaa acagagaaga atttgaagac 1380 
cagtcccttg aaaaagacag tgacgacaaa acaccagatg atgaccctga gcaaggaaaa 1440 
tctgaggagc caacagaagt tggggataaa ggtaactctg tgtcagcaaa tcttggcgac 1500 
aacacaacaa atgcaacttc agaagagact agtccctctg aagggaggag ccctgtgggg 1560 
tgtctctcag aaacccccga tagcagcaac atggcagaga agaaggtggc atctgagctc 1620 
ccccaggatg tgccagtagg tgatttcaaa tcggagaagt ccaacgggga gctaagtgaa 1680 
tctcctggag ctggaaaagg agcatctggc tcaactcgaa tcatcaccag attgcggaat 1740 
ccagatagca aacttagtca gctgaagagc cagcaggtgg cagccgctgc acatgaagca 1800 
aataaattat ttaaggaggg caaagaggta ctggtagtta actctcaagg agaaatttca 1860 
cggttgagca ccaaaaagga agtgatcatg aaaggaaata tcaacaatta ttttaaattg 1920 
ggtcaagaag ggaagtatcg cgtctaccac aatcaatact ccaccaattc atttgctttg 1980 
aataagcacc agcacagaga agaccatgat aagagaaggc atcttgcaca taagttctgt 2040 
ctgactccag caggagagtt caaatggaac ggttctgtcc atgggtccaa agttcttacc 2100 
atatctactc tgagactgac tatcacccaa ttagaaaaca acatcccttc atcctttctt 2160 
catcccaact gggcatcaca tagggcaaat tggatcaagg cagttcagat gtgtagcaaa 2220 
cccagagaat ttgcattggc tttagccatt ttggagtgtg cagttaaacc agttgtgatg 2280 
ctaccaatat ggcgagaatc tttaggacat accaggttac accggatgac atcaattgaa 2340 
agagaagaaa aggagaaagt caaaaaaaaa gagaagaaac aggaagaaga agaaacgatg 2400 
cagcaagcga catgggtaaa atacacattt ccagttaagc atcaggtttg gaaacaaaaa 2460 
ggtgaagagt acagagtgac aggatatggt ggttggagct ggattagtaa aactcatgtt 2520 
tataggtttg ttcctaaatt gccaggcaat actaatgtga attacagaaa gtcgttagaa 2580 
ggaaccaaaa ataatatgga tgaaaatatg gatgagtcag ataaaagaaa atgttcacga 2640 
agtccaaaaa aaataaaaat agagcctgat tctgaaaaag atgaggtaaa aggttcagat 2700 
tctgcaaaag gagcagacca aaatgaaatg gatatctcaa agattactga gaagaaggac 2760 
caagatgtga aggagctctt agattctgac agtgataaac cctgcaagga agaaccaatg 2820 
gaagtagacg atgacatgaa aacagagtca catgtaaatt gtcaggagag ttctcaagta 2880 
gatgtggtca atgttagtga gggttttcat ctaaggacta gttacaaaaa gaaaacaaaa 2940 
tcatccaaac tagatggact tcttgaaagg agaattaaac agtttacact ggaagaaaaa 3000 
cagcgactcg aaaaaatcaa gttggagggt ggaattaagg gtataggaaa gacttctaca 3060 
aattcttcaa aaaatctctc tgaatcacca gtaataacga aagcaaaaga agggtgtcag 3120 
agtgactcga tgagacaaga acagagccca aatgcaaata atgatcaacc tgaggacttg 3180 
attcagggat gttcagaaag tgattcctca gttcttagaa tgagtgatcc tagtcatacc 3240 
acaaacaaac tttatccaaa agatcgagtg ttagatgatg tctccattcg gagcccagaa 3300 
acaaaatgtc cgaaacaaaa ttccattgaa aatgacatag aagaaaaagt ctctgacctt 3360 
gccagtagag gccaggaacc cagtaagagt aaaacaaaag gaaatgattt tttcatcgat 3420 
gactctaaac tagccagtgc agatgatatt ggtactttga tctgtaagaa caaaaaaccg 3480 
ctcatacagg aggaaagtga caccattgtt tcttcttcca agagtgcttt acattcatca 3540 
gtgcctaaaa gtaccaatga cagagatgcc acacctctgt caagagcaat ggactttgaa 3600 
ggaaaactgg gatgtgactc tgaatctaat agcactttgg aaaatagttc tgataccgtg 3660 
tctattcagg atagcagtga agaagatatg attgttcaga atagcaatga aagcatttct 3720 
gaacagttca gaactcgaga acaagatgtt gaagtcttgg agccgttaaa gtgtgagttg 3780 
gtttctggtg agtccactgg aaactgtgag gacaggctgc cggtcaaggg gactgaagca 3840 
aatggtaaaa aaccaagtca gcagaagaaa ttagaggaga gaccagttaa taaatgtagt 3900 
gatcaaataa agctaaaaaa taccactgac aaaaagaata atgaaaatcg agagtctgaa 3960 
aagaaaggac agagaacaag tacatttcaa ataaatggaa aagataataa acccaaaata 4020 
tatttgaaag gtgaatgctt gaaagaaatt tctgagagta gagtagtaag tggtaatgtt 4080 
gaaccaaagg ttaataatat aaataaaata atccctgaga atgatattaa atcattgact 4140 
gttaaagaat ctgctataag gccattcatt aatggtgatg tcatcatgga agattttaat 4200 
gaaagaaaca gctccgaaac aaaatcgcat ttgctgagtt cttcagatgc tgaaggtaac 4260 
taccgagata gccttgagac cctgccatca accaaagagt ctgacagtac acagacgacc 4320 
acaccctcag catcttgtcc agaaagcaat tcagttaatc aggtagaaga tatggaaata 4380 
gaaacctcag aagttaagaa agttacttca tcacctatta cttctgaaga ggaatctaat 4440 
ctcagtaatg actttattga tgaaaatggt 
ggagaatcta aaagaaaaac cgtcatcaca 
acagaatcaa aaactgtgat caaggtagaa 
acagaaaatt gtgcaaaatc cactgtcaca 
acaccctcca caggcggcag tgtggacatc 
gtcaccacga cagtgacaga ctccctgacc 
actgtgagca aagagtattc cacacgagac 
aagaagactc gttcaggtac agctctgcca 
aagaagagca tttttgtttt gcctaatgat 
atccgagagg tcccttattt taattacaat 
ccttctccta gaccgacctt tggcatcact 
ttagctggag tgagcctgat gttacggtta 
gcggccaagg ctcctccagg aggagggact 
acaacaacag aaataattaa gaggagagat 
tgtatcagga aaatcatttg tcccattgga 
cctcagagga aaggccttcg atcaagtgca 
caaactggcc ctgttattat tgaaacctgg 
atcagggcat ttgctgagag agtggagaaa 
aagaaacgac tggagcagca gaagccgaca 
agcagtacaa ccagcaccat ctctccagca 
tcagttacaa ctggaaccaa aatggtacta 
acattccaac aaaacaagaa ctttcatcaa 
tcaaattcag gcgttgttca agtacagcag 
ggtaccagtc agcaaacctt tacttcattc 
cccaatacct caggctctgg aggaaccaca 
attcgccctg gtatgaccgt gattagaaca 
attattcgaa cacctgtgat ggtacagcca 
atcagggggc agcctgtctc cactgcagtc 
gggcagaaaa gcttaacttc agcaacgtcc 
ccccctcgcc cccaacaagg acaagtgaag 
cagggccacg gtggcaatca aggtttgaca 
ggacagttgc agttgatacc tcaaggggtg 
atgcaagctg caatgccaaa tggtactgtt 
acagccacca cagccagcac caccaccacc 
gaacaaaggc agagtaaact gtcaccccag 
ccagctcagt catcaagtgt gggtccagca 
gctcagcccc agccccaaac ccagccccag 
cctgaagttc agacccaaac aactgtttca 
cacgcacagt catccaagcc ccaagttgca 
ggacagtctc ctgttcgtgt ccaaagtcca 
tcccaactgt ctcctggaca acaatcccag 
attcaaccac atacatctct tcagatacct 
gtggtgatga agcataatgc tgtaatagaa 
gctgaaagag aagagaatca aagaatgatt 
gataagatag ataaagaaga aaaacaggca 
gagcagaaac gtagcaagca gaatgccact 
gagcagctca gagccgagat cctgaagaag 
gaagtgcagg aagagctgaa gagagacctg 
ttggctcagg ccacagcagt agctgcaccc 
cctccagccc ctccaccttc acctccccct 
tccacgccca ccttacctgc tgcttcccag 
agctcaaagt ccaagaaaaa gaaaatgatc 
acaaagcttt actgtatctg taaaacgcct 
gatctttgta ctaactggta tcatggagaa 
aaaatggatg tgtacatctg taatgattgt 
ttgtactgta tctgcagaac accttatgat 



ctgcccatca acaaaaatga aaatgtcaat 4500 
gaagtcacca cgatgacctc cacagtggcc 4560 
aaaggcgata agcaaactgt ggtttcttcc 4620 
accaccacta caacagtgac caagctttcc 4680 
atctctgtaa aggagcagag caaaaccgtg 4740 
accacgggag gcacactggt tacatctatg 4800 
aaagtgaaac tgatgaaatt ttcaagacca 4860 
tcctatagaa aatttgttac caagagcagc 4920 
gacttaaaaa agttggcccg aaaaggagga 4980 
gcaaaacctg ctttggatat atggccatat 5040 
fcggaggtata gacttcagac agtaaagtcc 5100 
ctgtgggcaa gtttgagatg ggatgatatg 5160 
acacggacag aaacatccga aactgaaatc 5220 
gttggtcctt atggcattcg atctgaatat 5280 
gttccagaaa caccaaaaga aacgcctaca 5340 
ctgcggccaa agagaccaga aacgcccaag 5400 
gtagcagaag aagaactgga attgtgggag 5460 
gaaaaggcac aagcagttga gcaacaggct 5520 
gtgattgcaa cttccactac ttccccaaca 5580 
cagaaggtta tggtggcccc cataagtggc 5640 
actactaaag ttggatctcc agctacagta 5700 
acctttgcta catgggttaa gcaaggccag 5760 
aaagtcctgg gtatcattcc atcaagtaca 5820 
cagcccagga cagcaacagt cacaattagg 5880 
agcaattcac aagtaatcac agggcctcag 5940 
ccactccaac agtcaacact aggaaaggca 6000 
ggtgctcctc agcaagtgat gactcaaatc 606 0 
tccgccccta acacggtttc ctcaacacct 6120 
acttcaaata tacagtcttc agcctcacaa 6180 
ctcaccatgg ctcaacttac tcagttaaca 6240 
gtagtaattc aaggacaagg tcaaactact 6300 
actgtactcc caggcccagg ccagcagcta 6360 
cagcgattcc tctttacccc attggcaaca 6420 
actgtttcca cgacagcagc aggtacaggt 6480 
atgcaggtac atcaagacaa aaccctgcca 6540 
gaagcccagc cacagactgc tcagccttca 6600 
tccccagctc agcctgaagt tcagactcag 666 0 
tcccatgtcc cttctgaagc acaacccacc 6720 
gcacagtctc agcctcaaag taatgtccaa 6780 
tcacagactc gaatacgtcc atcaactcca 6840 
gttcagacta caacctcaca accgattcca 6900 
tcccaaggcc agccacagtc acaaccccag 6960 
catttaaaac agaaaaagag catgactcca 7020 
gtctgtaacc aggtgatgaa gtatattttg 7080 
gcaaaaaaac ggaagcgtga agagagtgtg 7140 
aagctgtcag ctctgctctt caagcacaaa 7200 
agagcactcc tggacaagga tctgcaaatt 7260 
aaaattaaga aagaaaaaga cctgatgcag 7320 
tgccccccag tgacaccagc tcctccagcc 7380 
ccacctgctg tgcaacacac aggccttctg 7440 
aagaggaagc gggaagagga aaaagactcc 7500 
tctactacct caaaggaaac taagaaggac 7560 
tatgatgaat ctaagttcta tattggctgt 7620 
tgtgttggca tcacagaaaa ggaggctaag 7680 
aaacgggcac aagagggcag cagtgaggaa 7740 
gagtcacaat tttatattgg ctgtgatcgg 7800 
tgtcagaatt ggtaccatgg gcgctgcgtt ggcatcttgc aaagtgaggc agagctcatt 7860
gatgagtatg tctgtccaca gtgccagtca acagaggatg ccatgacagt gctcacgcca 7920
ctaacagaga aggattatga ggggttgaag agggtgctcc gttccttaca ggcccataag 7980
atggcctggc ctttccttga accagtagac cctaatgatg caccagatta ttatggtgtt 8040
attaaggaac ctatggacct tgccaccatg gaagaaagag tacaaagacg atattatgaa 8100
aagctgacgg aatttgtggc agatatgacc aaaatttttg ataactgtcg ttactacaat 8160
ccaagtgact ccccatttta ccagtgtgca gaagttctcg aatcattctt tgtacagaaa 8220
ttgaaaggct tcaaagctag caggtctcat aacaacaaac tgcagtctac agcttcttaa 8280

<210> 52
<211> 2597
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No: 5093550CB1

<400> 52

cctgcaagca ttgggagatc cacagctaag acgccagggc ctcccctgga agcctagaaa 60
tgggaccatt gacatttatg gatgtggcca tagaattctc tgtggaggag tggcaatgcc 120
tggacactgc acagcagaat ttatatagga acgtgatgtt agagaactac agaaacctgg 180
tcttcctggg tattgctgtc tctaagccag acctgatcac ctgtctggag caaggcaaag 240
agccctggaa tatggagaga catgagatgg tggccaaacc cccaggtatg tgttgttatt 300
ttgcccaaga ccttcggcca gagcagagca taaaagcttc tttgcaaaga ataatactga 360
gaaaatatga aaaatgtgga catcacaatt tacagttaaa aaaaggctat aaaagtgtgg 420
atgagtacaa ggtgcacaaa ggaagttata atggatttaa ccagtgtttg acaactaccc 480
agaguaaaat atttcagtgt gataaatatg tgaaagactt tcataaattt tcaaattcaa 540
atagacataa gactgaaaag aatcctttca aatgtaaaga atgtggcaag tcattttgtg 600
ttctttcaca cctaactcaa cataaaagaa ttcacactac tgtcaattcc tacaaacttg 660
aagaatgtgg caaggccttt aatgtgtcct caaccctttc tcaacataag agaattcata 720
caggacaaaa acactacaaa tgtgaagaat gtggcatagc ctttaacaag tcctcacacc 780
ttaacacaca taagataatt catactggag agaaatccta caaacgtgaa gaatgtggaa 840
aagcttttaa catatcctca caccttacta cacataagat aattcatact ggagagaatg 900
cctacaaatg taaagaatgt ggcaaagctt ttaaccaatc atcaaccctt actagacata 960
agataattca tgctggagag aagccctaca tatgtgaaca ttgtggcaga gcttttaacc 1020
aatcctcgaa ccttactaaa cataagagaa ttcatactgg tgataaacct tataaatgtg 1080
aagaatgtgg caaagccttt aatgtgtcct caacccttac tcaacataag agaattcata 1140
ctggagagaa accttacaaa tgtgaagagt gtggcaaagc ctttaacgtg tcctcaactc 1200
ttactcaaca taagagaatt catactggag aaaaaccata caaatgtgaa gaatgtggca 1260
aagcctttaa cacatcctca cacctcacca cacataaaag aattcatacc ggagagaaac 1320
cctacaaatg tgaagaatgt ggcaaagcct ttaaccagtt ctcacaactt actacacata 1380
agataattca tactggagag aaaccctaca aatgtaaaga atgtggcaaa gcttttaagc 1440
ggtcctcaaa ccttactgaa cataggataa ttcatactgg agagaaaccc tacaaatgtg 1500
aagaatgtgg caaagctttt aacctatcct cacaccttac aacacataag aaaattcata 1560
ctggagagaa accctacaaa tgtaaagaat gtggcaaagc ttttaaccaa tcctcgacac 1620
ttgctagaca taagataatt catgctggag agaaacccta caaatgtgaa gaatgtggca 1680
aagcttttta ccaatactca aaccttactc aacataagat aattcatact ggagagaaac 1740
cctacaaatg tgaagaatgt ggcaaagcct ttaattggtc ctcaactctg actaaacata 1800
aggtaattca tactggagag aaaccctaca aatgtaaaga atgtggcaaa gcttttaacc 1860
aatgctcaaa ccttactaca cacaagaaaa ttcatgctgt agaaaaatct gacaaataag 1920
aaaaatggac caggcgtggt ggctcacgcc tgtaatccca gcactttggg aggccgaggc 1980
gggtggatca caagatcaag agattgagac catcctggcc aacgaagaga attactggag 2040
agaaaccgta caaacctgaa agatataaca gtgcagttga caacacctca atatttctaa 2100
acacctttaa atggttgtca tacttgattg taggtaaaat atagtagagg aaacctctac 2160
aagtgtgaag aatgtggcaa aacttttaac caatgctgac actttattgg acaggaaatg 2220
atgtatactt gagaaaaatt atgcaaatgt aaagaatgtg aagaagccat taatatctgt 2280
tcacatctta ttcaatatca gagagttcat acttaataga agcattaaag atgaaattac 2340
tgtcaataga tttttcagaa aatataagcc tttaaagtga agaagagtat ttattttgaa 2400
gacaagcatt aaaatatcaa gagtgtaagc tgggcatggt ggctcaggcc tgtaatttca 2460
gcacttggag gccggggcag gtggtttact tgaggtcaga ggttcagacc agcctggcca 2520
acatggtgaa actccatctc tacccaggat gcagaaatta ggtgggtgtg gtggtgcatg 2580
actaatccca gctactc 2597



<210> 53 
<211> 725 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7487977CB1 

<400> 53 

gcatgtccaa gccggtggac cacgtcaagc ggcccatgaa cgccttcatg gtgtggtcgc 60 

gggctcagcg gcgcaagatg gcccaggaaa accccaagat gcacaactcg gagatcagca 120 

aacgcctagg tgccgaatgg aagcttctgt ccgaggcaga gaagcggcca tacatcgatg 180 

aagccaagcg gctacgcgcc cagcacatga aggagcaccc tgactacaag taccgacctc 240 

ggcgcaagcc caagaacctg ctcaagaagg acaggtatgt cttccccttg ccctacctgg 300 

gcgacacgga cccgctcaag gcggctggcc tgcccgtggg ggcctccgac ggcctcctga 360 

gcgcgcccga gaaagcccgg gccttcttgc cgccggcctc ggcgccctac tccctgctgg 420 

accccgcgca gtttagctcg agcgccatcc agaagatggg cgaagtgccc cacaccttgg 480 

ctaccggcgc tctgccctac gcgtccaccc tgggctacca gaacggcgcc ttcggcagcc 540 

tcagctgccc cagccagcac acgcacacgc acccgtcccc caccaaccct ggctacgtgg 600 

tgccctgtaa ctgtaccgcc tggtctgcct ccaccctgca gccccccgtc gcctacatcc 660 

tcttcccagg catgaccaag actggcatag acccttattc gtcagcccac gctacggcca 720 
tgtaa 725 



<210> 54 
<211> 1952 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 1706514CB1 



<400> 54 

gcggcccggg agtacctgta cctttcagct gcgccggccg cgaggccacg gagagctcgc 60
cttggagagc ccaggagcag gggagacatg gactcagtgg tctttgagga tgtggctgtg 120
gacttcaccc tggaggagtg ggctttgctg gattctgctc agagggacct ctacagagat 180
gtgatgctgg agaccttccg gaacctggcc tcagtagatg atgggactca atttaaagcc 240
aatgggtcag tttctctgca ggatatgtac gggcaagaaa aatctaagga acagacaata 300
ccaaacttca caggaaataa ttcctgtgcc tacactttag aaaaaaattg tgaaggctat 360
ggcactgaag accaccacaa aaatctgaga aatcatatgg tggacagatt ctgtacacat 420
aatgaaggta atcaatatgg agaagccatc catcaaatgc cagatcttac cctgcacaag 480
aaagtttctg ctggagaaaa accatatgaa tgcaccaagt gcaggacagt cttcacgcat 540
ctttcttctc ttaaaaggca cgtcaagtct cactgtggac gaaaagcacc tccaggtgag 600
gaatgtaagc aggcctgcat ttgtccctca cacctacaca gtcacggaag aaccgacact 660
gaggagaagc cgtataagtg tcaagcatgt gggcaaactt tccaacatcc tcgttacctc 720
tcccaccacg taaagactca cacagcagag aaaacctaca aatgcgagca gtgtcggatg 780
gcgtttaatg ggttcgcaag cttcactaga catgtgagaa ctcacacaaa agacaggcca 840
tataaatgtc aggaatgtgg gagagccttc atttatccct cgacatttca aagacacatg 900
acaacacaca ctggagagaa gccctataaa tgtcagcact gtgggaaagc cttcacttac 960
ccccaggctt ttcaaagaca tgagaagacg cacacgggag agaagcccta tgaatgcaag 1020
cagtgtggga aaacattcag ttggtctgaa accttgcgag tccacatgag gatccacacc 1080
ggggacaaac tctataaatg tgaacactgt gggaaggctt ttacctcttc cagatcattc 1140
caaggtcatt tgaggacgca cactggagag aaaccttatg agtgtaaaca atgtgggaaa 1200
gccttcactt ggtcctcaac gtttagagaa catgtgagaa ttcacacgca agagcagctc 1260
tataaatgtg aacaatgtgg gaaggctttt acctcttcca gatcattccg aggtcatttg 1320
aggacgcaca ctggagagaa gccttatgag tgtaaacaat gtggaaaaac cttcacttgg 1380
tcctcaacgt ttagagaaca tgtgagaatt cacacgcaag agcagctcca taaatgtgaa 1440
cactgtggga aggcctttac ctcttccaga gcattccaag gtcatttgag gatgcacact 1500
ggagagaagc cttatgagtg taaacaatgt ggaaaaacct tcacttggtc ctcaacctta 1560
cataatcatg tgaggatgca cactggagag aaacctcaca aatgtaaaca atgtgggatg 1620
tccttcaagt ggcactcctc cttccggaac catctgagga tgcacacagg acagaaatcc 1680
cacgaatgtc agtcatactc aaaagccttc agttgccaag tcattctttc taaaaccagt 1740
gagagcacac actaaagaga aattctataa ctttaatggg gtaacctcac attaattcat 1800
gtataatgct ccagaaaatt cacaccagga gagaaatctt acaagtatga tattgtcttt 1860
gtcaatacct catttgtaaa acagacccat tagtcgtgaa tttccagctc tttcaaaaat 1920
aaatgtttat gtgagaaaaa aaaaaaaaaa



<210> 55 
<211> 1213 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7488247CB1

<400> 55 

gttttgctcc gcccggccga gctgcaggga ccgcacggtg cccgggtctc ccgccgcaga 60
ggatccgctg gcccgggccg ggggagccgg cgctgaccac cgcgcgctgt ccccgcaggc 120
agccggcctg ccgccatggc cgaccacctg atgctcgccg agggctaccg cctggtgcag 180
aggccgccgt ccgccgcggc cgcccatggc cctcatgcgc tccggactct gccgccgtac 240
gcgggcccgg gcctggacag tgggctgagg ccgcgggggg ctccgctggg gccgccgccg 300
ccccgccaac ccggggccct ggcgtacggg gccttcgggc cgccgtcctc cttccagccc 360
tttccggccg tgcctccgcc ggccgcgggc atcgcgcacc tgcagcctgt ggcgacgccg 420
taccccggcc gcgcggccgc gccccccaac gctccgggag gccccccggg cccgcagccg 480
gcgccaagcg ccgcagcccc gccgccgccc gcgcacgccc tgggcggcat ggacgccgaa 540
ctcatcgacg aggaggcgct gacgtcgctg gagctggagc tcgggctgca ccgcgtgcgc 600
gagctgcccg agctcttcct gggccagagc gagttcgact gcttctcgga cttggggtcc 660
gcgccgcccg ccggctccgt gagctgctga gggcggccgg cgcccgcccg gcgtgccgga 720
gaggagaaag ggcccgactg cccgccggac cctgcaccca gcgactgggc cccgcgcgcg 780
ccctccgcga gggtggaggc ggcggctgtg tgcgcagggc ccggcaccgg actgggaccc 840
tggcgtccct ccaggccttg cctcctgcgg gaggacagtt tggcttcact tctctgaccc 900
cagcctcggc cgtaaagtga aagagaccgg accagcttca gctttcggac tctggttctt 960
ggatcgtgtc ctctccccct cgccgccctc ttcccccaat ctgagccatt gcaggcctct 1020
gcctgctgcc ccctctctcc tcgggatcgg gtccccagag ccaccatctc ctgagcctcc 1080
caccccgctg cctgggccct gtggttgctg ggcctcccac ctcaaggagg ggaaggttgt 1140
acagcccgaa cccgtggagc aatgccctgt ctggcctcca aaaccaaaat aaaactgggt 1200
cactttaaaa aaa 1213



<210> 56 
<211> 2257 
<212> DNA 

<213> Homo sapiens 
<220>

<221> misc_feature

<223> Incyte ID No: 1427269CB1 

<220> 

<221> unsure 
<222> 2238 

<223> a, t, c, g, or other 
<400> 56 

gcggggcggt tatggcggct ccatattaac agcctcctcc tcctccgccg ccgccgccgt 60
ctcctcctcc tcctcctttc cctcccgccc gcgctctaag ccatctccgc cttcaccctg 120 
acgcctgcct cttcccctca cctttccccc tcccctgttc taccatgccc ggcatgatgg 180 
agaaagggcc cgagttactg gggaagaacc gatcggccaa cggcagcgcc aagagcccgg 240 
caggcggcgg cggcagcggc gcctcgtcca ccaacggcgg gctgcactac tcagagcccg 300 
agagcggctg cagcagcgac gacgagcacg atgttgggat gagagtcgga gccgaatacc 360 
aagctcggat ccctgaattt gatccaggtg ctacaaagta cacagataaa gacaatggag 420 
ggatgcttgt atggtctcca tatcacagta tcccagatgc caaattggat gaatacattg 480 
caattgcaaa ggaaaagcat ggctacaatg tggaacaggc acttggcatg ttgttctggc 540 
ataaacataa cattgagaag tcccttgctg atctccctaa tttcactccc tttccggatg 600 
agtggacagt ggaagataaa gtcctatttg aacaagcctt tagttttcat ggaaagagct 660
ttcacaggat tcagcaaatg cttccagata agacaattgc aagccttgta aaatattact 720 
attcttggaa aaaaactcgc tctaggacaa gtttgatgga tcgccaggct cgtaaactag 780 
ctaatagaca taatcagggt gacagtgatg atgatgtaga agaaacacat ccaatggatg 840 
ggaatgatag tgattatgat cccaaaaaag aagccaaaaa agagggtaat actgaacaac 900 
ctgtccaaac tagcaagatt ggacttggaa gaagagagta tcagagttta caacatcgcc 960 
atcattctca gcgttctaag tgccgtccac ctaagggcat gtatttaacc caggaagatg 1020 
tggtagcagt ttcctgtagt cccaatgcag ccaacaccat cctgaggcaa ctggacatgg 1080 
agttgatctc tctaaaacgt caggttcaga atgctaagca agtaaacagt gcacttaaac 1140 
agaaaatgga aggtggaatt gaagaattca aacctcctga gtcaaatcag aaaattaatg 1200 
cccgttggac cacagaggag cagcttctag cagtgcaagg tgtccgcaaa tatggtaaag 1260 
attttcaagc tattgcagat gtaattggca acaagactgt tggccaagtg aagaacttct 1320 
ttgtaaacta caggcgtcgg tttaacttag aggaggtatt gcaggagtgg gaagcagaac 1380 
aaggaaccca ggcttctaat ggtgatgctt ctactttagg ggaggagaca aaaagtgctt 1440 
ctaatgtgcc atcagggaag agcactgatg aagaagagga ggcacagacc ccacaggctc 1500 
ctcggacact gggtccatca cctcctgccc catcatccac tccaacacca acagccccta 1560 
ttgccactct gaaccagcct ccaccacttc ttcgtccaac actgcctgct gccccggctc 1620 
ttcaccggca gcctcctcca ctccagcagc aggctcggtt catccagccc cggccaactt 1680 
taaatcagcc tccaccacct cttattcgcc ctgctaattc catgccaccc cgtctaaacc 1740 
caagaccggt gttgtccacg gttggtggtc aacagccacc atcacttatt ggaattcaga 1800 
cagattcaca gtcctcactg cactaaaaat taaattggac acagctgcag taacttttca 1860 
ccccatcatt ataccagtgc tcatctgact gatgaaaaag aggaaagaat aatcatttct 1920 
agatactgag gctgcgaact agttctgtgg cagtggacta gcataagtgg atgtctaaga 1980
aatttttcag ttcactagac taaaatgttt tacaacaaaa agcctccagt tagcctcctt 2040 
tctagagtat atgttcagca atgtgatctc ataaaaggaa aaaccaaaga tttagtgttc 2100 
tatataccaa ggttttggtt tgttttactg gatttattta atggaggtct ttattatcct 2160 
ggctctcata gtcaaggctc tagtacagga tattgctttt aggatgtgaa acctcttaag 2220 
gtcttaaggt aggatgtngg ctttttcttt aattttt 2257

<210> 57 
<211> 2415 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature
<223> Incyte ID No: 103135CB1
<400> 57 

gcgggcggcc tggacggcct ggaaggccag cgcgcaccac cgagacgtgg gctcctagag 60 
gggccggaag ctttcggata acaacttccg ctcgggaagt ttgtaaaagt ctgggctacc 120 
ggcgcggcgt agtggatgca gcatcctagt ggaggacgcc cctgtgatct gccctccttg 180 
gcactgtgct tccccagagg ggtggcctcg ctgttcccat ggacatggcc caggagccag 240 
tgaccttcag ggacgtggcc atctacttct caagggagga gtgggcgtgt ctggaaccca 300 
gccagagggc cctctaccgg gacgtgatgc tggacaactt cagcagtgtg gctgctctgg 360 
gattttgcag ccccagacca gacctcgtct ctcgcctgga acagtgggag gagccgtggg 420 
ttgaagaccg ggagagacct gagttccagg cagtgcagag gggaccccgg ccaggggcaa 480 
ggaagtctgc agaccccaag agacattgtg atcatccagc ttgggctcac aagaaaaccc 540 
acgtgcggcg agaaagagcc agggaaggaa gcagctttag gaagggcttc aggctggaca 600
cggatgacgg gcagcttccc agagctgctc cagaaaggac agacgccaag cccacggctt 660 
tcccgtgtca ggtgctcacg cagcgttgtg ggcggcggcc gggccgcaga gagcgccgga 720 
agcagcgcgc agtagagctg tcgttcatct gcggcacgtg cgggaaggcg ctcagctgcc 780
acagccggct gctcgctcac cagacggtgc acacgggaac caaggccttc gagtgccccg 840
agtgcggcca gaccttccgg tgggcttcaa acctgcagcg ccaccagaag aaccacacgc 900 
gcgagaagcc cttctgctgc gaggcctgcg ggcaggcgtt cagcctgaag gaccgcctgg 960 
ctcagcaccg caaggtccac accgagcaca ggccctactc gtgtggcgac tgtgggaaag 1020 
ccttcaagca gaagtccaac cttctcagac accagctggt gcacaccggg gagcggccgt 1080
tctactgcgc ggactgcggc aaagccttcc ggaccaagga gaacctcagc caccaccaga 1140 
gggtccacag cggggagaag ccctacacct gtgccgagtg cggcaagtcc ttccggtggc 1200 
ccaagggctt cagcatccac cggaggctgc acctgacgaa gaggttctac gagtgcggcc 1260 
actgtgggaa aggcttccgt cacctggggt tcttcacgcg gcatcagagg actcacaggc 1320 
acggggaggt gtaggggcgc ccgaagcgtg gggtgctgcg cctctgcggg agtactgggt 1380 
cctgagggag agctgcagtg agaagttgct cttcagcctg gaaaatcaac ctgaattcag 1440 
agaagccttc t Lag tec tea gagctcccca gtcccccgag aagtttactg ggaaaactgc 1500 
caggtgggag aagcagagcc atgggtacgc cggagatggc gggggctctg gagatggcgg 1560
gggctgcgcc ccggcgccgg gcatcctggg gatgtgctga gagtgtgcgc gaccccggag 1620 
ccacgtgcca ggccgggctc agaggcggag aagcctgcct ggtgcccaca gccgtctggc 1680
tcagggactc caccctggcc ccgagttgcc gtctgctggg cctttccttc ctggctctgc 1740 
accccatgct ggctgcccgg tctggcttcc cttcttgtct ctgtcttggg cgaggcagct 1800 
gtgagcattg cacagaggca aagaccctcc tgcagcctct gcgctgggcc gtagaaacaa 1860 
gagcctttgt aataccgaac ctcattcaag gattaggagt ggtggttagg tcagggccac 1920
ccccagtgct gcaggaacgg cctccaccca gctctgttgg tcagagcctg ggtcatgcac 1980
ctggagttgg gagatcaagc tgggtctcag ggcagtgagg tggccatatc caccacatcg 2040
catttcgtgg gggaagaggt gacctctttg ttttaaactt aaggtgtctg cttatccagc 2100 
cagaaataaa aatctgccag tggtgttccc aagggaagac ccccgtggga atggatcggg 2160 
acactgccgt gtttcaggtt agccaattat ttttgctttt gcctgttttt ctttaagaat 2220
tgcagccagg ccaggcgtgg tggctcacgc ctgaaatccc agctactcgg gaggctgagg 2280
caggagaatc gtttgaatcc gggaggtgga ggttgtggtg agccaaggtt gcgccattgc 2340
actccagcct gggcagtaag agtgaaactc cgtctcaaaa aaaaattcca gccaacagat 2400
gcatatgtaa tgttg 2415 

<210> 58 

<211> 4272 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 1907346CB1 

<400> 58 

tcaggctggg tctgtgtcct aaaggggagc tcagtccagc ccagctcccc actgctgcag 60
tgtgtgtggg cttctccagg aggagggtgg gttctgaaga aaggggtcag aaactcactt 120
cttcctgtag gagtttctcg tccccgtatt ccccctggtg cagtggggcg cagaggacca 180 
tctttccaca gggtgagatc tgtgcatgtg actgtggcac atgaagggga tgtatagatt 240 
ctcagcccta agcagcatat tttggatatt cagtattgac ttctaaagac tcttttatgt 300 
tacccaagga agaagtctgg aagaagagga agaggaaaga aaaggagtca gggatggccc 360 
ttactcaggt acggttgaca tttagggatg tggccataga attctctcag gaggagtgga 420 
aatgcctgga ccctgctcag aggatcttat acagggacgt gatgttggag aactactgga 480 
accttgtttc tctgggactg tgtcattttg atatgaatat tatctccatg ttggaggaag 540 
ggaaagagcc ctggactgtg aagagctgtg tgaaaatagc aagaaaacca agaacgcggg 600 
aatgtgtcaa aggcgtggtc acagatatcc ctcctaaatg tacaatcaag gatttgctac 660 
caaaagagaa gagcagtaca gaagcagtat tccacacagt ggtgttggaa agacacgaaa 720 
gccctgacat tgaagacttt tccttcaagg aaccccagaa aaatgtgcat gattttgagt 780 
gtcaatggag agatgacaca ggaaattaca agggagtgct tatggcccag aaagaaggta 840 
aaagagatca acgcgacaga agagacatag aaaacaagct tatgaacaat cagcttggag 900 
taagctttca ttctcatctg cctgaactgc agctatttca aggtgagggg aaaatgtatg 960 
aatgtaatca agttgagaag tctaccaaca atggttcctc agtgtcacca cttcaacaaa 1020 
ttccttctag tgtccaaacc cacaggtcta aaaaatatca tgaacttaac catttttcat 1080 
tactcacaca aagacgaaaa gcaaacagtt gtggaaaacc ttataaatgt aatgaatgtg 1140 
gcaaggcgtt cactcagaat tcgaacctta caagtcatag gagaattcat agtggagaga 1200 
agccttacaa atgcagtgag tgcggcaaaa cctttactgt tcgttcaaat ctaactattc 1260 
atcaggtcat ccatactgga gaaaaacctt acaaatgtca tgagtgtggc aaggtcttca 1320 
ggcacaattc ataccttgca actcatcggc gaattcatac tggagagaaa ccttacaagt 1380
gtaatgagtg tggaaaagcc tttagaggac attcaaacct aactacccat cagttaattc 1440 
atactggaga aaaaccgttc aaatgtaatg aatgtggcaa gctcttcact caaaattcac 1500 
accttataag tcattggaga attcacactg gagagaaacc ttacaagtgc aatgagtgcg 1560 
gcaaagcctt tagtgttcgt tcaagcctag caatccatca gacaatccac actggagaaa 1620 
aaccttacaa atgtaatgaa tgtggcaaag tctttaggta caattcatac ctcggaaggc 1680 
atcggagagt tcatactggt gagaaacctt acaagtgtaa tgaatgtggc aaagccttca 1740 
gtatgcattc aaacctagct acccatcagg tcatccatac tggaacaaaa cctttcaaat 1800 
gcaatgaatg cagcaaggtt ttcactcaaa attcacaact tgcaaatcat cgaagaatgc 1860 
atactggaga gaaaacttac aagtgtaatg agtgtgggaa agccttcagt gttcgttcaa 1920 
gtctgactac ccatcaggca atccattctg gagagaaacc ttacaaatgt attgaatgtg 1980 
gcaagagctt cactcaaaaa tcacacctta gaagtcatcg gggaattcat tctggagaga 2040 
aaccttacaa gtgtaatgaa tgtggtaaag tcttcgctca aacatcacaa cttgcaaggc 2100 
attggagagt tcatactgga gaaaaacctt acaagtgtaa tgactgtggc agagccttta 2160 
gtgatcgttc aagcctaact tttcatcagg caatacatac tggagagaaa ccttacaaat 2220 
gtcatgaatg cggcaaggtt tttaggcaca attcatacct tgcaactcat cggcgaattc 2280 
atactggaga gaaaccttac aagtgtaatg agtgtgggaa agcctttagt atgcattcaa 2340 
acctaactac ccataaggtc atccatactg gagagaagcc ttacaaatgt aatcaatgtg 2400 
gcaaggtctt cactcagaac tcacaccttg caaatcatca aaggactcac accggagaga 2460 
aaccttaccg atgcaatgag tgtgggaaag ccttcagtgt tcgttcaagc ctaaccaccc 2520 
atcaggcaat ccatactggg aaaaaacctt acaaatgtaa tgaatgtggc aaggtcttta 2580 
ctcaaaatgc tcacctggca aatcaccgaa gaattcatac tggggagaaa ccttacaggt 2640 
gtacagagtg tgggaaagcc tttagggtaa gatcaagtct aactacccat atggcaatcc 2700
acactggaga aaagcgttac aaatgtaatg agtgtggcaa ggtcttcagg cagagttcaa 2760 
atcttgcaag tcatcacaga atgcataccg gagagaaacc ttacaaatga gtgtggtgag 2820 
gtcattaggt acaattcact cctttcacat cagttaattt cattcttgac agaatcctta 2880 
caaatgtagt gacagtggcc aatccctcat gagttgaagc attaatagat atgagaggcc 2940 
ataagcaaga gacatcatgt aaacatatgt ggcagagggt ctatccaggc ctcgcaggtt 3000 
actaggcatc aagatttata tctttgatga aacgaaacaa atgtaatatg catcctgagg 3060 
ccattaccca gtgaccgatg gtaagtgagg attcctagga ggaataacag tctctggttt 3120 
ccctgtttgc ctttgatatt atacactgta gaatactcac aagtccaaat atgctaaaaa 3180 
ttatatattt ttaactcaca tacgaaaagg ttgcaggata tttgtaggca gtcagttacc 3240 
ttcaccttat gaaatgtttc actgagttat ttaaggtttt ttggaaagcc tactattgcg 3300 
tttcaatgtg aactttgaaa tcttattgtg catccttaca caccttccat ggtgctttct 3360 
tggaaagatc attgggatgg aaggatcatt gattgggtga agatcattga ttaggtgaag 3420 
gattatttct atccaatttg tgaagaagga ggactttgct tttaaaatta agtatcatct 3480
gaattagcat ttgggagtgg cgaaaaacaa tgtaaaacta tgatgtcact caccattctg 3540 
ataatgttca gggtgccttt ctcctaccag gagagtactg tggcttagag gaaagaaatg 3600 
gtctatcaac tgaacatgaa atggagcagg ccaagacctt aggacattgg gatttttgtg 3660 
ggaggagagt aataggtaat tagacactga ttgtgtggta gaaatactgc aggggaaaag 3720 
gtcgccctct tatgcatcaa agagcaatac ctgttgttta gcaaagagtg atgaaaaatt 3780 
gatcttgttt tgaaattgaa gagagaggcc aggcgcggtg gctcacacct gtaatcccag 3840
cactttggga ggctgaggca ggtggatcac ctgaggtcgg gagttcgaga ccagcctgac 3900 
caacatggag aaaccccaat tgtactaaaa atacaaaatt agccgggcgt ggtggcaggt 3960
gcctgtaatc ccagctactt gggaggctga ggcaggagag tgcttgaacc caggaggcgg 4020 
aggttgtggt gagccaagat catgccactg cactccagcc tgggcaacaa gaacaaaact 4080 
ccatctaaag aagaaaaaga aaaactgaag agagagtagt tgtcagagtg aagagtttaa 4140 
tcaacagaag tagagtaaca tattctttag tagaactcac attatctgct tgaagcacat 4200 
tttgctaatg ttttattcag aatgtatcat agatataatg ttttaactaa taaaatagaa 4260 
tggtttttct ct 4272 

<210> 59 

<211> 1823 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 3041036CB1 

<400> 59 

ggcagagcgg cgagccggtg agtgtggctg cggggtctgc gcccaccctc tacctgtgcc 60 
gggggccgga acctg^cccc ggggtagggc gcggcctcgg ggagacggcg tgcaggccca 120 
gcactgttgg tcaccgcagg cctgtcttcc tattcaccgt gggcagcaga gggatggctg 180 
ctcagctgct gactgatgag gcactggaat cagtgacgtt cagggatgtg actgtggact 240 
tcacccagga ggagtggcag cagctggagc ctgcccagaa ggacctgtac agggatgtca 300
tgctggagaa ctacaggaac ctggtctcac tggactggga gactagacct gaaatgaaag 360
agttggatcc aaagaatgac atttcggaag acaagctctc cgttgttggg gaggccacgg 420 
ggggacccac gaggaatggt gccaggggtc ctggctcaga aggagtgtgg gaaccaggca 480 
gctggccaga gaggccgcgg ggagatgcag gtgcagagtg ggagccattg ggaattcccc 540 
aggggaacaa actcttaggg ggctcagtac ccgcatgtca tgaactgaag gcatttgcca 600
accaaggctg tgtcctggtc ccaccacggc tggacgaccc cacagaaaag ggggcctgtc 660 
cacccgtaag gcgtggcaag aacttctcca gcacttcaga cctcagtaag ccccccatgc 720 
cctgcgagga gaagaaaacc tacgactgca gcgagtgtgg caaggccttt agccgaagct 780 
cgtccctgat aaagcaccaa aggatccaca cgggagaaaa gccgtttgag tgtgacacct 840 
gtgggaagca cttcatcgag cgctcgtccc tcaccatcca ccagcgggtg cacacgggcg 900 
agaagcccta tgcctgcggg gactgcggca aggccttcag ccagcgcatg aacctcactg 960 
tgcaccagcg cacgcacacg ggcgagaagc cgtatgtgtg cgacgtgtgt ggcaaggcct 1020
tccggaagac ttcctctctc acccagcacg agcggatcca cacgggggag aagccctacg 1080 
cgtgcgggga ctgcggcaag gccttcagcc agaacatgca cctcatcgtg caccagcgca 1140 
cgcacaccgg ggagaagccg tacgtgtgcc ccgagtgcgg gcgagccttc agccagaaca 1200 
tgcacctgac cgagcaccag cgcacgcaca ccggggagaa gccgtacgcc tgcaaggagt 1260
gcggcaaggc cttcaacaag agctcctcgc tcaccctgca ccagaggaac cacaccggcg 1320 
agaagcccta cgtgtgcggc gagtgcggca aggccttcag ccagagctcc tacctcatcc 1380 
agcaccagcg cttccacatc ggcgtgaagc cgttcgagtg cagcgagtgc ggcaaggcct 1440 
tcagcaagaa ctcctcgctc acgcagcacc agcgcatcca caccggcgag aagccctacg 1500 
agtgctacat ctgcaagaag cacttcacgg ggcgctcgtc cctcatcgtg caccagatcg 1560
tgcacaccgg ggagaagccc tacgtgtgcg gcgagtgcgg caaggccttc agccagagcg 1620 
cctacctcat cgagcaccag cggatccaca ccggcgagaa gccgtacagg tgcggccagt 1680 
gcgggaagtc cttcatcaag aactcctccc tcactgtgca ccagcggatc cacacgggcg 1740 
agaagcccta ccgatgcggc gagtgcggga agaccttcag ccgcaacacg aacctgacgc 1800 
gccacctgcg gattcacacc tga 1823

<210> 60 

<211> 2253 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 3856879CB1 

<400> 60 

gcgggatgcc gttccttcgc gcgtgaggct gcggctctga cggtgtgtaa cttgtatgtg 60 
gaaggaccaa cgtagcattt gcctcagcaa catccttgaa tctgcaccct cacccacatc 120 
cattcttatc agccccatag gctccttcaa tttccgtgat cctcggagtc cccaggagac 180 
caggtgatgg cagcagccag actcctgcca gtgccggcag gaccccaggc caagctgacc 240 
ttcgaggatg tggctgtgct cctctcccag gatgaatggg accgcctgtg ccctgctcag 300 
aggggcctct acagaaatgt gatgatggaa acctatggga atgtagtctc attgggactt 360 
ccaggatcca agcctgacat aatctcccag ctggagcgag gggaagatcc ctgggtcctg 420 
gacaggaagg gggctaagaa gagccagggc ctgtggagtg actactcaga caacctcaaa 480 
tatgaccaca ctacagcctg tacacaacaa gacagtttat cttgtccatg ggaatgtgaa 540 
accaagggag agagtcaaaa tacagacttg agtccgaagc cattaatttc agagcaaaca 600 
gtgattctgg ggaaaacacc cttggggagg attgatcaag aaaataatga aacaaagcaa 660 
agcttctgtc tgagtccaaa ctctgttgac caccgtgaag ttcaggtctt aagccaaagc 720 
atgccactca ctccgcacca ggcagtgcct agtggagaga ggccctacat gtgtgttgag 780 
tgtgggaagt gctttggccg gagttcccac ctccttcagc atcagcgtat ccacactgga 840 
gagaagccct atgtgtgcag tgtatgtggg aaggccttca gccagagctc agtccttagt 900 
aaacacagga gaattcacac aggtgagaag ccctatgagt gtaatgagtg tggaaaagcc 960
tttagagtga gctcagatct tgctcagcat cacaagatac atacaggaga gaagcctcac 1020
gaatgtcttg agtgtcggaa agccttcact caactctcac atctcattca gcaccagcgg 1080 
atccacacgg gagaaaggcc atatgtgtgt ccgttgtgtg ggaaagcctt caaccatagc 1140 
actgttctgc ggagccacca gagggtacac actggggaga agcctcacag gtgcaatgag 1200 
tgtgggaaaa ccttcagtgt gaagaggaca ctgctgcagc accagaggat ccacaccggg 1260 
gagaagccct acacgtgcag cgagtgtggg aaggccttca gcgaccgctc agtcctcatt 1320 
cagcaccaca acgtgcacac cggggagaag ccctatgagt gcagtgagtg tgggaagacc 1380 
ttcagccacc gctccacact gatgaatcac gagcggatcc acaccgagga aaagccctat 1440 
gcatgctacg aatgtgggaa ggccttcgtt cagcactcac acctgatcca gcaccagaga 1500 
gtccacactg gggagaagcc ctatgtgtgt ggtgaatgtg ggcacgcctt cagtgcacgc 1560 
cggtctctga tccagcatga gagaatccac acaggtgaaa agcccttcca gtgcacagaa 1620 
tgtggcaaag ccttcagcct gaaagcaact ctgattgtgc acctgaggac ccacacgggc 1680 
gagaagccat atgagtgcaa tagctgcggg aaggccttca gccagtactc agtgctcatc 1740 
cagcaccagc ggatccacac aggcgagaag ccctatgagt gcggggagtg tgggcgtgcc 1800 
ttcaaccagc atggccacct aatccagcac cagaaagtgc acagaaagtt gtgacccatg 1860 
gctgacacaa gaatccattc tcacagaaac tgcatgtgga accacaagca gccttcagcc 1920 
caagagaagt ctctgttaac tctataggaa gcttttcttt ggcgattcag tgtcacaaaa 1980 
taactccaga aagaagcact tagcgtgctg ttcctgtgga aaaacttcag agactacctg 2040 
ttttattttc ctcaacatct tgaagttatg ttggagagta atcatacaat tgtagagaat 2100 
tttggtaaaa aacagccata attctttaac attagtttat ttgaactaag ggaatttaag 2160 
gcataagaac cattatccca ataaaatctt acattccaaa taaagttctt tttctaagaa 2220 
cattaaaaaa aaaaaaaaaa aaaaaaaaaa aaa 2253 

<210> 61 

<211> 2788 

<212> DNA 

<213> Homo sapiens 
<220>

<221> misc_feature

<223> Incyte ID No: 4178665CB1 

<400> 61 

ccgcgcccgg ctataatctc tcattttcta agctcatgtg cctaatttca aattatataa 60 
ctttattgaa atattcttta agccactgac ttctagtaag aaccgttttc actttgtttc 120 
ctattttgaa aatgtaaact ttatgttatg ctggctacag gaaaataatt tttgtttgct 180 
tttatgtttt ctttcaggtt tgttatccag gcataagacc aagaaattat cttcagaaaa 240 
ggacattcat gaaatcagtt tatccaaaga gagtataata gaaaaaagta aaactcttcg 300
tctgaaagga tccattttta gaaatgagtg gcagaacaaa agtgagtttg agggtcaaca 360 
gggacttaaa gaaagatcta tcagtcaaaa gaaaatcgtc tctaaaaaaa tgtcaactga 420 
tagaaaacgt ccctctttta ctctgaatca gagaattcac aatagtgaga aaagctgtga 480 
ctcacacttg gttcaacatg ggaaaataga ttctgatgtg aaacatgatt gtaaagaatg 540 
tgggagtact tttaataatg tctatcagct tactctccat cagaaaattc atactggtga 600 
aaaatcctgt aaatgtgaga aatgtgggaa agtttttagt catagctatc aacttactct 660 
gcatcagaga tttcatactg gtgagaaacc ctatgaatgt caagaatgtg ggaagacctt 720 
tactctttac ccacaactta atcgacatca gaaaattcac actggtaaaa aaccctatat 780 
gtgtaagaaa tgtgataagg gtttttttag tagattagaa cttactcaac ataaaagaat 840 
tcatactggt aagaaatctt atgaatgtaa agaatgtgga aaagtttttc aacttatttt 900 
ctactttaaa gaacatgaga gaattcatac aggtaagaaa ccctatgaat gtaaggagtg 960 
tgggaaagct tttagtgtat gcggacaact tacccgtcat cagaaaattc atactggtgt 1020 
aaaaccctac gaatgtaagg aatgtggaaa gacctttaga cttagttttt accttactga 1080 
acacagaaga actcatgcag gtaagaaacc ttatgaatgt aaggagtgtg ggaaatcatt 1140 
taatgtgcgt ggacagctta atcggcataa aacaatccat actggtataa aaccttttgc 1200 
atgtaaggtg tgtgagaagg cttttagtta tagtggtgac ctcagagtac attctagaat 1260
tcatactgga gagaaaccat atgaatgtaa ggaatgcggg aaagccttta tgcttcgttc 1320 
agtccttact gaacatcag? gacttcatac tggtgtgaag ccctacgaat gtaaggaatg 1380 
tgggaagacc tttcgagttc gttctcaaat tagtctacat aagaaaattc atactgatgt 1440 
gaagccctac aaatgtgtac gatgtgggaa gacctttaga tttggtttct accttactga 1500 
acaccagaga attcacactg gtgaaaagcc ctataaatgt aaagaatgtg gaaaggcctt 1560 
tattcgtaga gggaatctta aagaacatct gaaaattcat tctggtttaa aaccctatga 1620 
ctgtaaagaa tgtgggaagt cctttagtcg gcgtgggcag ttcactgaac atcagaaaat 1680 
tcatacgggt gtaaaaccat acaaatgtaa agaatgtggg aaggccttta gtcgtagtgt 1740 
agaccttaga atacatcaaa gaattcatac tggtgagaaa ccctatgagt gtaaacaatg 1800 
tgggaaggcc tttagactta attcacacct tactgaacat cagagaattc acactggtga 1860 
gaaaccctat gagtgtaagg tatgtagaaa ggcctttaga caatattcac atctttatca 1920 
acatcagaaa actcataatg taatttaata taagaaaagg tttccatgtc atgctctatt 1980 
tatagaatat caaaatattt atggccagaa gttctgtcaa tgtgttgatg tttttttaca 2040 
catattaact taataaatgt atgagtctta aatacctctt agttctcatt aaatttagga 2100 
aaattcacac tagaaaataa agctgttaat gtaacagttg tggaaaagtg ttctagcaac 2160 
agcatatact tatcatcatt gcctttccac tactctacta tctgtgtgat attagacaaa 2220 
atatttgctt cttggtacct cagctgtaaa atgaaacaca cctaaaagtg tggttgtttc 2280 
caacatgtat aatacagcaa caactatctg gcccaaactg ctttggatta atattggata 2340 
ttactgtttt tattatcatc aacattatta ttagtggatt tcttaatagg aagatgcaat 2400 
ggagatgaca aatttggaaa aaccactcat cacttacatt tcatgaagta cttctttgat 2460 
aaaatctgtt atgggctgaa tgtttgtgtt cccgtaacaa ttcctatgtt gaaacacgaa 2520 
tcccaaggtg atggtatttg aaggtagggc ctttaggagg aaattaggtc atgagggtgg 2580 
agccttcatg agtggaatta ctgcctttat aagaagaagc caaagagcca gctagctctt 2640 
tcaaccacat gaggttacag caagaagtca gcagtctaca gtgcaaaaga gggccttcac 2700 
cagaacccaa gcatgctggc accttgacct tggactttca gcctccaaaa ctgtgagaaa 2760 
taaacctcag ttgtttataa aaaaaaaa 2788 

<210> 62 
<211> 3041 
<212> DNA 
<213> Homo sapiens
<220> 

<221> misc_feature 

<223> Incyte ID No: 7493326CB1 

<400> 62 

gaggaggccg agagatgagg ccggagccca ccaagttctg ggaagttctt tctgacacaa 60 
gtaagataac atctctgcca tgcccacagg tgctgagaac aagagaaatc aagtgaagga 120 
aggaggcgga gtttccaaga cttgggtgtc atcatttctg gggacatcct tgattggaga 180 
ttgaagtttt tgaaccgaaa tttagagctg attcagaaga gacaaataca gaacgtccaa 240 
gttagcagtc atttcccaaa tttaggagac aatgatgcag gcccaggaat ccctaacact 300 
ggaggatgtg gctgtggact tcacctggga ggagtggcag ttcctgagcc ctgctcagaa 360 
ggacctgtac cgggacgtga tgttggagaa ctacagcaac ctggtggcag tggggtatca 420 
agccagcaaa ccagatgcac tctccaaatt ggaacgagga gaagaaactt gcacaacaga 480 
agatgaaatc tactctcgaa tctgttctga ttcaggaggt gcatcaggag gtgcatatgc 540 
agaaatcagg aaaattgatg atcctctgca gcatcacttg caaaatcaaa gtattcagaa 600 
gagtgtgaaa cagtgccatg aacagaatat gtttggaaat attgttaatc agaacaaagg 660 
tcatttcctg ctgaagcaag attgtgatac gtttgactta catgaaaaac ctttaaaatc 720 
aaatttaagt tttgaaaacc agaaaaggag ctctggccta aagaactctg ctgagtttaa 780 
tagagatggg aaatcccttt ttcatgctaa ccataaacaa ttttatactg aaatgaagtt 840 
tcctgcaatt gcaaaaccta ttaataagtc ccagttcatt aagcaacaga gaactcacaa 900 
catagagaat gcccatgtat gcagtgaatg tgggaaagcc ttcctcaagt tgtctcagtt 960 
tattgatcat cagagagttc acactggaga aaaacctcat gtatgcagta tgtgtgggaa 1020 
agctttctcc agaaaatcca gactaatgga ccatcagaga actcatacag aactgaaaca 1080 
ttatgaatgc actgaatgtg acaaaacctt cctcaagaaa tcacagctca atatacatca 1140 
gaaaactcat atgggaggga aaccttacac atgtagccaa tgtgggaaag ccttcatcaa 1200 
gaagtgtcgg ctcatttatc atcaacgaac tcatacagga gagaaacccc atggfitgcag 1260 
tgtatgtggg aaggccttct ctacaaagtt cagtctcact acacatcaga aaactcatac 1320 
aggagaaaaa ccttatatat gtagtgaatg tggaaaaggc ttcattgaga agaggcgtct 1380
tactgcacat catcgaactc atactggtga gaaacccttt atatgcaata aatgtgggaa 1440 
aggcttcacc ttgaagaaca gtcttatcac acatcagcaa actcatacag gagagaaatt 1500 
atatacatgt agtgaatgtg gaaaaggctt ttcaatgaag cactgtctca tggtacatca 1560 
acgaactcat actggagaga aaccttataa atgcaatgag tgtggaaagg gcttcgcttt 1620 
gaagagccca ctcatcagac atcagcgaac acatactgga gagaaaccct atgtatgcac 1680 
cgaatgtcga aaaggtttca ccatgaagag tgacctcatt gtacatcagc gaactcatac 1740 
tgcagagaag ccatatatat gcaatgattg tggaaaaggc ttcactgtga agagccgcct 1800 
tattgtgcat cagcgaactc atactggaga aaaaccctat gtatgtggtg agtgtggaaa 1860 
aggctttcca gcaaagatcc ggctaatggg acatcaacga actcatacag gagagaaacc 1920 
ttatatttgc aatgagtgtg gaaaaggctt cactgagaag agtcatctca atgtacatcg 1980 
gcgcactcat acaggagaga aaccctatgt atgcagtgaa tgtggcaaag gcttactggg 2040 
aagagcatgc tcattgcacc atcaggcgaa ctcatactgg ggggagaaac cttatatatg 2100 
caatgaatgt ggaaagggct tcagcatgaa gagtactctc agtatacatc agcaaactca 2160 
tactggagag aagccataca aatgcaatga atgtgataaa accttcagga agaagacatg 2220 
cctcatacaa catcagcgat ttcacacagg aaagacttcc tttgcatgta ctgaatgtgg 2280 
aaaattctct ttgcgcaaaa atgatcttat tacacatcag agaattcaca caggagagaa 2340 
accgtacaaa tgcagtgact gcgggaaagc cttcactaca aaatcagggc tcaatgttca 2400 
tcaaagaaaa catacaggag agaggcccta tggatgtagt gattgtggga aagcttttgc 2460 
gcacttgtct atccttgtta aacacaagag aattcacagg tagtcatttt gggaaagcct 2520 
cttgccagat gtaggccctt aagatatctg caaagaagag taatttcatg aatgcagact 2580 
acatggttgt ttatttagtg atcagttact tcatgttttg tgtcagagaa aacatgtaca 2640 
aaacatttga gaaaatattt taggacatta tgtctaaaaa ttgtatactg agaaaaatcc 2700
tatgaatgtg gcagactata aaagcctttg gtgggaagat aaaccccctc agaagtgatc 2760 
atagatcatg aataaactac taattcgtgg aaatgtaata attataggaa cgtctttgcc 2820 
caaaaataaa acctcaacag atttgagaga gttcacactg gagaaaatac ttttcctctg 2880 
gcaggtgaat tgtggcaggg gttttcacgt cacaatgctt ggcctcatcg gtgttgctaa 2940 
gaaataattt ccgaaaaaaa cttattggac acccttccat gtttcacaga ctttgggaaa 3000
caccgagaat ttttgaagga aaaaccttgg aaatgttgtg c 3041 

<210> 63 
<211> 3445 
<212> DNA

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 1553836CB1 

<400> 63 

ctggccggag ccctgggtga aattgttagg cgtggagagg gagtgatgtc ttccagactc 60 
ggtgctgtac ccgcctagca ggacccagtg ctccataaag gataatagtt tccagtacac 120 
tatccctcat gatgactcct taagtggttc atcgtctgca tcttcgtgtg aaccagtgag 180 
tgattttcca gcatctttcc gaaaatctac ctactggatg aagatgagaa gaatcaagcc 240 
agctgctact tctcatgtcg aagggtcagg tggagtatca gccaagggga aaaggaaacc 300 
caggcaggaa gaagatgaag actatcgaga atttcctcag aagaagcata agctttatgg 360
gaggaagcaa cggcctaaaa ctcagcccaa tcccaaatcc caggcccgtc gtattcggaa 420 
ggaaccacca gtttatgcag caggcagttt ggaggagcaa tggtacttag aaatcgttga 480 
taaaggcagt gtctcctgcc ctacctgcca ggcagtgggg aggaagacca tagagggttt 540 
aaagaaacac atggaaaact gcaagcagga aatgtttact tgtcatcatt gtgggaaaca 600 
acttcgttca ctggcaggga tgaagtatca tgtcatggca aatcataata gtttgcccat 660 
tttgaaagcc ggagatgaaa tagatgagcc aagtgagagg gaaaggctcc gaacagttct 720 
aaagagactg ggaaagctca ggtgcatgcg tgagagttgc tccagtagct tcaccagcat 780
catgggatat ctctaccatg tcagaaaatg tggcaaaggg gctgcagagc tggaaaagat 840 
gaccctgaaa tgtcaccact gtggaaaacc atataggtcg aaggctggac ttgcatatca 900
cctgaggtca gagcatgggc ctatatcctt ctttccagag tcaggacagc cagagtgctt 960 
aaaggagatg aacctagagt caaagagtgg gggccgagtt cagagacgtt ctgccaagat 1020 
agctgtatac cacctacagg agctggcctc tgctgaactg gccaaggaat ggcccaagag 1080 
gaaggtgctt caggacctgg tacctgatga tcgaaagtta aaatatactc gtccagggct 1140 
ccctaccttc agccaggaag tactacataa atggaagaca gatatcaaga aatatcatcg 1200 
tattcagtgt cctaaccagg gctgtgaggc tgtctacagc agtgtatctg gccttaaagc 1260 
tcacctgggc tcttgtacat tgggaaactt tgtggctgga aaatacaaat gtcttctatg 1320 
tcagaaagaa tttgtgtcag agagtggtgt caagtatcac atcaactccg tccatgctga 1380 
ggactggttc gttgtaaacc caacaacaac caaaagcttt gaaaagctga tgaagataaa 1440 
gcagcggcag caagaagaag aaaagcggag gcagcagcac aggagcagaa ggtctctaag 1500 
aaggcggcag cagcctggca ttgagcttcc cgagacagag ctgagtctta gagtagggaa 1560 
ggatcagagg aggaataatg aggaactggt agtgtcagcc tcctgtaagg aaccagagca 1620 
ggagccagtg ccagcacagt tccagaaagt aaagccccca aagactaatc ataaacgagg 1680 
aaggaaatag gcagtcagtg taaaagtgct cctaggaaag cagatgtgat gctgttgtca 1740 
cggggacctg tgctgggagg actgagtgaa gattctttca gctgtttgtg taaggctgtg 1800 
actttctcag ctccttcctc ctgtgtaaat ttcactgtct tcccttctta cattttgatt 1860 
cccccctccc cttttagtag gttttcctgc tattcctcgt caagtcctct tgttttttta 1920 
tcttgcccaa agagctccct ctcaaggcca actataggct cctcttgccc tgtacaaaac 1980 
taagaaacct ctttggttgt cctttccttc ctggggtata gaatgttctt ggaagctcca 2040 
ttgatttagt agctgctcca tcatctagct tgtgaaacca ttccaaacta attttttaaa 2100 
accataactg atttgtcatt ttgtatttgt gatataacaa gtctagaagt tagaactgtt 2160 
gtcattcaca taataagatt actctgtctc cttgggaaaa aaactttatg gaggctgttt 2220 
gtcctctcaa tatggtttta agattgaaag taagaaaacg gatttaggat gaaaactcta 2280 
gaactaccct atgctgttta tactgggaaa tgctttgtac caagtagcag tgactagacc 2340 
cacagacatg aaaagcaacc ttaggagtaa agtgacccaa acattaaaat gacaggaaag 2400 
agaaagtaga agcagcaata aatactgccc caactttcct ggagcccgag gcctcatcca 2460 
tagctatgat cacttgccct ctgaagctta tttttgcttc tttggtttta agaattgaga 2520 
aatatcacat tgcccctgat gttttgacag tctcctagtg ctcctggtag tgccactaag 2580 
ggaaaaccaa ggtgcgcatt ccttctccct ggactttacc ttacttgtta gtctacgccc 2640
cactgtttcc acccatcccc ttagccaacc tctgtctttt tgaattttct gagaatattg 2700
tcctatcctc ttttatatat ggagttctct cctctttata tcctgagact ttgacaccag 2760
atgtagatat ttatctggag ctggaaagaa aaattctttt tctgtacctc atgcctatct 2820
ggtaatgttt aatgggttat ttctctttga gggtggcttt ctctggaaca ttggttagag 2880
cagctttgtt gtcgtgtttc tggattctct tccccatttt gcgtaaatat tggtcttata 2940
tattcttgcc tattttgtgg catatgccac ataaaaaatg aacctgatat agacaagtac 3000
taccttttca aattctgaaa ggctattacc actttaactc tttgtgctcc tccaaatagc 3060
tttaaaatgt gggcttttgt gaagaccact ttcaaacaag ggagcactga aacctgaatt 3120
ggatactgcc agaataggta gttttgaaac aagttaagga catggtatat gcactctgca 3180
ttttcattgg cagtgtgccc ttaaagccct ttcagtagat gagggttgtc agggaggaga 3240
aatgaagaag ctatgttaat ttctggtgag taagacctgg ggaatgtttg gcaatgacaa 3300 
aagaaataaa tgactctcag aaagatgttt ccagtgttct ttgacgccga cctgctgcat 3360 
gactccttga cactgtatgg gggtaagaag atggctagag atgggggtga gtttgaaata 3420 
aattccacat gcagttgtct cagtg 3445 

<210> 64 
<211> 2929 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 1908201CB1 

<400> 64 

ccggcctgcc ttccttcctt ccgcccgccc gccggcccgc cctaagagtt gaccacgccg 60 
caatgaagag cttcttacag tgaaagcaaa gtaggtcgag tccactgaaa atgcctttaa 120 
gggacaaata ctgtcagact gaccaccatc atcacggatg ctgtgaacca gtttatatcc 180 
tggaacctgg agatcctcct ttgttacagc aaccactaca gacatccaaa tctggtattc 240
aacaaataat tgagtgcttt cgatcaggaa ctaaacaact taagcatatt ttattaaaag 300 
atgtggacac tatttttgaa tgtaagttat gccgcagtct cttcagagga ttaccaaatt 360
taattaccca taaaaaattc tactgcccac caagtctcca gatggatgac aaccttcctg 420 
atgtaaatga taaacaaagc caagccataa atgatctcct agaagccata tatccaagtg 480 
tggacaaacg agaatatatt attaagctag aacccataga aactaatcaa aatgcagtat 540 
ttcaatatat ttcgaggact gataatccta ttgaagtcac agagtcaagc agtactcctg 600 
aacaaaccga agttcagata caggaaacta gcactgaaca gtcaaaaaca gtaccggtta 660
cagatacaga ggtggaaact gtagagcccc ctcctgttga gattgttaca gatgaagttg 720 
cacctacatc tgatgaacaa cctcaggagt cgcaggctga cttggaaact tctgacaatt 780 
ctgattttgg tcaccagttg atatgttgtc tttgtagaaa agaattcaat tctagacgag 840 
gtgttcgccg tcacattcga aaagtacaca agaaaaagat ggaagaacta aaaaagtaca 900 
ttgaaacacg aaagaatcca aaccaatcct ctaaaggacg cagtaagaat gttctagttc 960 
cattaagtag gagttgtcca gtatgttgta aatcatttgc tacaaaagcg aatgtaagga 1020 
ggcattttga tgaagttcat agaggactaa ggagggattc aattactcct gatatagcaa 1080 
caaagcctgg gcaacctttg ttcctggatt ctatttctcc taaaaaatct tttaagactc 1140 
gaaaacaaaa gtcttcttca aaggctgaat acaatttaac tgcatgcaaa tgcctccttt 1200 
gcaagaggaa atatagttca caaataatgc ttaaaagaca tatgcaaatt gtccacaaga 1260 
taactctttc tggaacaaac tctaaaagag aaaaaggccc taataatact gccaacagtt 1320 
cagaaataaa agttaaagtt gaaccagcag attctgtaga atcttcaccc ccttccatta 1380 
cccattctcc acagaatgaa ttaaagggaa caaatcattc aaatgaaaaa aagaacacac 1440 
cggcagcaca gaaaaataaa gttaaacaag actctgaaag ccctaaatca actagtccgt 1500 
cggctgcagg tggccagcaa aaaaccagaa aaccaaaact ttcagctggc tttgacttta 1560 
agcaacttta ctgtaaactt tgtaaacgtc agtttacttc caaacagaac ttgactaaac 1620 
acatcgagtt gcacacagat ggaaataaca tttatgttaa attctacaag tgtcctcttt 1680
gcacttatga aactcgtcgg aaacgtgatg tgatacgaca tataactgtg gttcataaaa 1740 
agtcatctcg ttatcttggg aaaataacag ccagtttaga gatcagagct ataaaaaagc 1800 
ctattgattt tgttctaaat aaagtggcaa aaagaggccc ttcgagggat gaagcaaaac 1860
atagtgattc aaaacatgat ggcacttcta actctcctag taaaaagtat gaagtagctg 1920 
acgtcggtat tgaagtaaaa gtcacaaaaa acttttctct tcacagatgc aataaatgtg 1980 
gaaaggcatt tgccaaaaag acttaccttg aacatcataa gaaaactcat aaggcaaatg 2040 
cttccaattc acctgaagga aacaaaacca aaggccgaag tacaagatct aaggctcttg 2100 
tctggtgagg aacagttaac agagttttgc ttttttcccc ccatcgaact aaaaaaaaaa 2160 
tcatttgacc ataatttata gctggttcca ttttaacacg tttgcttcca tatatctcat 2220 
ggcaatggga actgcaagag taatgtgcat attgcattta cctcttcagt gacctttatt 2280 
ccagtggctt gggaacaaaa gttaacttca gaacttatct tccacaggac aatgcaatgt 2340 
agttgtaggt agatggcaca gggtcagtgc tttcctttta atgttgtaaa atatatacat 2400 
ttatattggc ttatgtttac aatagaagtc ttctgtttaa ctaactttgc acaggtttaa 2460 
tttgattcag tgacttagtc tactaattaa tgaattgtag gaaagttaaa tatattagaa 2520 
tgaacttgtg aaggcgataa ctataggaaa aaatcttttg aggcactgta attattgtaa 2580 
aaattaatct gtgacggcta aataaaatgc tgcactcaga aaaaaaaatc cccgaattga 2640 
attaaaaatg attagcattt atcatttact tttaacatag tgcttccagg caaaggggga 2700 
aagttggtgg taaatagaag tttggatttt ttttccttct ccttgggagt gtgtgttagt 2760 
tactgttttt tagttttgat ctatggaaac aatgcttatt taggtcccaa atgcttcccc 2820 
tctcttttta ataaactttt tggtgatcat tttaacatag ttaatattat taactaggga 2880 
tttgtttttt gtggttgtgt gtcgtaagaa aagtaaaatt catatttgg 2929

<210> 65 
<211> 1923 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 2827615CF1 

<400> 65 

ccccggccag gatcgagccc tggcccgggc cctggcccag ccccggcctc caaggaccgc 60 
gccgaaggag ctcgactttc tcaggatact gtccctctcc cacagaggag ctgaaggagt 120 
aggacagaag aactgtcaaa ttctggaatc cttaaagcca tgtccaagga tttggtgaca 180 
tttggggatg tggctgtaaa tttctctcaa gaggaatggg aatggctgaa ccctgctcag 240 
aggaatttgt acaggaaagt gatgttggag aactacagga gcttggtatc attggcagga 300 
gtttctgttt ctaagccaga tgtgatctca ttattggagc aaggaaaaga gccctggatg 360
gtgaagaagg agggaacaag aggcccatgc cctgattggg agtatgtatt taaaaacagt 420 
gaattttcat caaagcaaga gacatatgaa gaatcatcca aagttgtgac agtgggagca 480 
agacatctta gttatagcct tgactatccc agtttgagag aagactgtca aagtgaggac 540 
tggtataaga accagctggg aagtcaagag gtacatctta gtcaattaat catcactcat 600 
aaagaaatcc ttccagaagt tcaaaataaa gaatataaca aatcttggca aacattccac 660 
caggatacaa tctttgatat acaacagagt tttcccacca aagaaaaagc acataagcat 720 
gaaccacaaa agaaaagtta ccgaaaaaaa tctgttgaaa tgaaacatag gaaagtctat 780 
gtagaaaaga aacttttgaa atgtaatgat tgtgagaaag tcttcaacca gagctcatcc 840 
cttactcttc atcagagaat tcatactgga gagaaaccct atgcatgtgt tgaatgtggg 900 
aaaacgttca gccagagtgc aaacttggcg caacataaga gaatacatac tggagagaaa 960 
ccctatgaat gtaaagaatg taggaaagcc ttcagccaga atgcacacct ggcccaacat 1020 
cagagagttc atactggaga gaaaccttat cagtgtaaag aatgtaaaaa agccttcagc 1080 
cagattgcac acctgactca gcatcagaga gttcatactg gagagagacc tttcgaatgt 1140 
attgaatgtg gaaaggcctt tagtaatggt tcatttcttg ctcagcatca gagaattcat 1200 
acaggagaga aaccttatgt gtgtaatgtg tgtgggaaag cctttagcca tcgtggatac 1260 
ctaattgtac atcagagaat tcatactgga gagagaccct acgaatgtaa ggaatgtagg 1320 
aaagccttca gccagtatgc acaccttgct caacatcaga gagttcatac tggagaaaaa 1380 
ccttatgaat gtaaagtatg taggaaagcc ttcagccaaa ttgcatacct tgatcaacat 1440 
cagagggttc atactggaga gaaaccctat gaatgtattg aatgtgggaa ggcctttagc 1500 
aatagttcat cacttgcaca acatcagaga agtcatactg gagaaaaacc ctatatgtgt 1560 
aaggaatgta ggaaaacatt tagccagaat gcaggccttg ctcaacatca gagaattcat 1620
actggagaga aaccttatga atgtaatgtt tgtgggaaag cgtttagcta tagtggatct 1680
cttactctac atcagagaat tcatactgga gaaagaccct atgaatgtaa agattgcagg 1740
aaatctttca ggcagcgtgc acatcttgct catcatgaga gaattcatac tatggagtca 1800
ttcttgactc tttcctctcc ctcaccctcc acatcaaatc agttgccaag acctgtaggt 1860
ttcatctcct gaatatttct ggaatccacc tcttgaatcc atttccatcc catcatcctt 1920



<210> 66 

<211> 5601 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 4304550CB1 

<400> 66 

actccctccc cgcggggcgc gcagctcgcg ggtctttgga caccaccggt cctgagtccg 60 
cggactgcca ttttcattaa gaactgccac ttagaggtac caaaataaag ggtatttgct 120
acctttaata cttgccagtt caggttggag gcacaggcag cagcaagaat ggaaagaaat 180
gttcttacaa cattttcaca ggaaatgtcc cagttaattt tgaatgaaat gccaaaagct 240 
gaatattcca gtttattcaa tgattttgtt gaatctgaat tttttttgat tgatggggat 300 
tcattactta tcacatgtat ctgtgagata tcatttaagc ctgggcagaa cctccatttc 360 
ttctatctgg ttgaacgcta tcttgtggat cttattagca aaggaggaca attcaccata 420 
gttttcttca aggatgccga gtatgcgtat ttcaacttcc ctgaacttct ttctttgaga 480 
actgctttaa tccttcatct tcagaagaat accaccattg atgttcgaac aacattttcg 540 
agatgcttat caaaagagtg gggaagtttc ttggaagaga gttacccata tttcctgata 600 
gttgcagacg aaggcctgaa cgatctacaa acacagcttt tcaacttttt aatcattcat 660
tcttgggcaa ggaaggtcaa cgttgtactt tcctcagggc aagaatctga tgttctttgc 720 
ctttatgcat accttcttcc aagcatgtac agacaccaga ttttttcctg gaagaataag 780 
cagaacatta aagatgctta tacaaccctg cttaaccagt tggaaagatt taagctttca 840 
gcattagcac ctctttttgg aagtttaaaa tggaataata ttacggaaga ggcacacaag 900 
actgtatctc tgcttacaca agtctggcca gaaggatctg acattcggcg tgtcttttgt 960 
gttacttcat gctcattatc tttgagaatg taccatcgct ttttaggaaa cagagagccc 1020 
tcctctggtc aggaaactga gatccaacag gtgaacagta attgcttaac cctgcaggag 1080 
atggaagatt tgtgtaaact gcattgtctc actgtggttt ttctactcca tctgcctctt 1140 
tctcaaagag cttgtgctag agtcatcact tcccattggg ctgaggacat gaagccttta 1200 
ttacaaatga aaaagtggtg tgaatatttc atcttaagaa atatacatac ttttgaattt 1260 
tggaatctga atttaattca cctttctgac ttaaatgatg agcttttgtt gaagaatatt 1320 
gctttttact atgaaaatga aaatgtaaaa ggcctacatt tgaatttggg agataccatt 1380 
atgaaagatt atgaatatct ctggaatacc atatcaaagt tggtcagaga ctttgaggtt 1440 
ggacagccat ttcctctgag aacaacaaaa gtttgttttc ttgaaaagaa accatcacca 1500 
atcaaagaca gctccaatga aatggtgccc aatttgggtt ttattccaac gtcatctttt 1560 
gtggttgata aatttgctgg agatattttg aaagatttgc cttttctaaa gagtgatgat 1620 
cctattgtta cttcactggt taaacaaaag gaatttgatg aacttgtgca ctggcattct 1680 
cataaacccc tgagtgatga ttatgacagg tccaggtgtc agtttgatga aaaatctaga 1740 
gaccctcgtg ttcttagatc tgtgcaaaag tatcatgttt tccaacggtt ttatgggaat 1800 
tcattagaaa cagtctcttc gaaaatcatc gtgactcaaa ctattaagtc aaagaaggat 1860 
tttagtgggc ccaagagcaa aaaggcacac gagaccaagg ctgaaataat tgctagagag 1920 
aataagaaaa ggttatttgc cagggaagaa caaaaggaag agcaaaagtg gaatgctttg 1980 
tcattttcta ttgaagagca attgaaagaa aatttacact ctggaataaa gagcctggaa 2040 
gattttttga aatcctgtaa aagtagctgt gtgaaacttc aggttgaaat ggtggggtta 2100 
actgcttgct tgaaagcctg gaaagaacat tgccgaagtg aagaaggtaa aaccacgaaa 2160 
gatttaagta tagctgttca ggtgatgaaa aggatccact ccttgatgga aaaatactca 2220 
gaacttttac aagaagatga tcggcaactc atagccagat gccttaagta tttaggattt 2280 
gatgagttgg caagttcttt acatccagcc caggatgcag aaaatgatgt aaaagtgaag 2340
aaaaggaata aatattcagt tggcattggg ccagctcggt tccaactgca atacatgggc 2400
cattatttga tacgagatga gagaaaagac ccagatccca gggtccagga ttttattccc 2460 
gacacatggc agcgagagct ccttgatgtt gtggataaga atgagtcagc agtgattgtt 2520 
gccccaacgt cctcaggcaa aacctatgcc tcctactact gtatggagaa agtgctgaag 2580 
gagagcgacg acggggtggt cgtgtacgtt gcacccacaa aggcccttgt taatcaagtg 2640 
gcagcaactg ttcagaatcg ttttacgaaa aatctgccaa gtggtgaagt tctctgtggt 2700 
gttttcacca gggagtatcg tcatgatgcc ttaaactgtc aggtacttat tacagtgcct 2760 
gcctgctttg aaattctgct gcttgctcct catcgccaaa actgggtgaa aaagatcaga 2820 
tatgttatat ttgatgaggt tcattgtctt ggtggagaaa ttggagcaga aatctgggaa 2880 
catctccttg tcatgatccg atgtcccttt ttggctcttt cagctaccat aagtaatcct 2940 
gaacatctca ccgagtggct acaatcggta aaatggtact ggaaacaaga agacaaaata 3000 
attgaaaata ataccgcttc taaaagacat gtgggtcgtc aggccggctt tcccaaagac 3060 
tacttgcaag taaaacaatc gtataaagtt agacttgtgc tctatggaga gaggtataat 3120 
gatctagaga agcatgtatg ttcaataaaa catggtgaca ttcattttga tcattttcac 3180 
ccatgtgctg cactaacaac agatcatatt gaaaggtatg gattccctcc tgatcttacc 3240 
ctttcacctc gagaaagcat ccagctgtat gatgccatgt ttcaaatttg gaaaagttgg 3300 
cctcgggccc aggaactgtg cccagaaaac ttcattcatt ttaacaataa attagtcatt 3360 
aaaaagatgg atgctaggaa atatgaagag agtctaaagg cagaattaac aagttggatt 3420
aaaaatggca acgtagagca ggccagaatg gtacttcaga atcttagtcc tgaagcagat 3480 
ttgagtccag aaaacatgat caccatgttt ccacttctag ttgaaaaact aaggaaaatg 3540 
gagaagttac ctgcactatt ttttttattc aagttaggag ctgtagaaaa cgcagctgaa 3600 
agtgtgagca ctttcctaaa gaaaaagcag gagacaaaaa ggcctcccaa agctgataaa 3660 
gaagcccatg tcatggctaa caaacttcga aaagttaaaa aatccataga gaaacaaaag 3720 
atcatagatg aaaagagcca gaaaaaaacc agaaatgtgg atcaaagcct aatacatgaa 3780 
gctgaacatg ataatctagt gaagtgtcta gagaagaacc tggaaatccc acaggactgc 3840 
acatatgctg atcaaaaagc agtggacact gagactttgc agagggtatt tggtcgagta 3900 
aaatttgaaa gaaaaggtga agaattgaaa gccttggcag aaaggggtat tggatatcat 3960 
cacagtgcta tgagtttcaa agaaaaacaa ttagttgaaa tcctctttag aaaaggatat 4020 
cttagggtgg tgacagctac tggaacactt gctttaggtg tcaacatgcc ttgtaaatct 4080 
gtggtttttg ctcaaaactc agtctatctg gatgcgttga attatagaca gatgtctggc 4140 
cgtgctggaa gaagaggtca agacctgatg ggagatgtat atttctttga tattccattc 4200 
cccaaaatag gaaaactcat aaaatccaat gttcctgagc tgagaggaca cttccctctc 4260 
agcataaccc tggtcctgcg actcatgctg ctggcttcca agggagatga cccagaggat 4320 
gccaaggcaa aggtgctatc agtgctaaag cattcattgc tgtccttcaa gcaacccaga 4380 
gtcatggaca tgttaaaact ttacttcctg ttttctttgc agttcctggt gaaagagggc 4440 
tatttagatc aagaaggtaa tcctatgggg tttgctggac ttgtatcaca tttgcattat 4500 
catgaacctt ctaatcttgt ttttgtcagt tttcttgtaa atggactctt ccatgatctc 4560 
tgtcagccaa ccaggaaagg ctcaaaacat ttttctcaag acgttatgga aaagctagta 4620 
ttagtattgg cacatctctt tggaagaaga tattttccac caaagttcca agatgcacac 4680 
ttcgagtttt atcaatcaaa ggtgttcctt gatgatctcc ctgaggattt tagtgatgct 4740 
ttagatgaat ataacatgaa aattatggag gactttacca ctttcctacg aattgtttcc 4800 
aaactggctg atatgaatca ggaatatcaa ctcccattgt caaaaatcaa attcacaggt 4860 
aaagaatgtg aagactctca actcgtatct catttgatga gctgcaagga aggaagagta 4920 
gcaatttcac catttgtttg tctgtctggg aactttgatg atgatttgct tcgactagaa 4980 
actccaaacc atgttactct aggcacaatc ggtgtcaatc gctctcaggc tccagtgctg 5040 
ttgtcacaga aatttgataa ccgaggaagg aaaatgtcgc ttaatgccta tgcactggat 5100 
ttctacaaac atggttcctt gataggatta gtccaggata acaggatgaa tgaaggagat 5160 
gcttattatt tgttgaagga ttttgcactc accattaaat ctatcagtgt ttccttgcgt 5220 
gagctatgtg aaaatgaaga cgacaacgtt gtcttagcct ttgaacaact gagtacaact 5280 
ttttgggaaa agttaaacaa agtctaaaaa caaagtctat gcaaaccact taaaaataat 5340 
tccatagtag tttttcaggt cacgtttttg attcttatgc ttcttgccag aaatacatta 5400 
tgataaagtg gaaatacatt acgatgaagt ggaaagagca aacactttgg aatcaaacag 5460 
agttgcaatc aaacctgcca tgttctgtca tgaatactca caaattattt agtatacctg 5520 
aatcttggtt tctttttata actgagtaat aatggttaca tgtcagggga tccactagtt 5580 
taagcgccgc ccccgtgtca g 5601 
<210> 67

<211> 2802 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7473738CB1 

<400> 67 

gaccccctac gtcatcaggg cgcgtcctcg cctttcccct cccatctcct cggctccgtg 60 
gacgtgttcg cttcttctcc ggaccgggtc tatgtcccgg tttcccgccg tcgcgggcag 120 
ggcgccaagg cggcaggagg aaggcgagcg gccaatagag ctccaggaag agcggccgtc 180 
agctgttcgc atcgctgaca gagaagagaa aggatgcaca tcacaggagg gaggaacaac 240 
tccaacgttt cctattcaga aacaaagaaa aaagctcatt caagctgtga gggacaactc 300 
gttcctcatc gttactggaa atacaggaag tggtaaaaca actcaactcc ctaaatacct 360 
ttatgaagca gggttttcac aacatggtat gattggtgtg actcaaccac gaaaagtagc 420 
tgctatatca gttgctcaga gagtagctga agaaatgaaa tgcactttgg gatctaaagt 480 
aggataccaa gttcgttttg atgattgcag ttctaaggag acagcaatca aatatatgac 540 
tgatggatgt ttactgaaac atattctggg agacccaaat cttaccaaat tcagtgtcat 600 
tattttggat gaagcccatg aaagaactct aactacagat atcttatttg gtttactgaa 660 
gaagctattt caggagaagt ctcctaatag gaaggagcat ttaacaagtg gtggtacatg 720 
tcatgcaact atggaattag ccaagctctc tgcattcttt ggaaattgcc ccatatttga 780
tatacctgga agactctatc cagtcagaga gaaattctgc aatttgattg gtccacgaga 840 
cagagaaaat actgcgtata ttcaagcgat tgtgaaagtc accatggata tccatttgaa 900 
tgaaatggct ggagacatct tggtttttct gactggccag tttgaaatag aaaaaagttg 960 
tgagttactt tttcagatgg cagagtctgt tgattatgat tatgatgttc aagataccac 1020 
cctcgatggc ttgttaatat tgccgtgtta tggatcaatg acaacagatc aacagaggag 1080
gatatttttg ccaccaccac ctggaattag aaaatgtgtc atatccacca atatttctgc 1140 
aacgtctttg acaatagatg gaatcagata tgtggtagat ggtggctttg tgaagcagtt 1200 
aaatcacaac cccagattag ggttggacat cctggaggtg gttccaattt caaagagcga 1260
ggcattacag cgaagtggcc gagctggcag gacttcttca ggaaaatgct ttcggatcta 1320
tagtaaagat ttttggaacc agtgtatgcc tgaccatgtg atccctgaaa ttaagagaac 1380 
tagtttgaca tctgtagttc tgaccttaaa gtgccttgcc atacacgatg tcataaggtt 1440 
tccctatttg gatccaccta atgagagact tattttagaa gctcttaaac aactttacca 1500 
gtgtgatgct attgacagga gtggccatgt caccagattg ggtttgtcta tggtggagtt 1560 
tcctttgcct ccacatctga catgtgcagt aataaaagct gcttccctgg attgtgaaga 1620 
tctactactt ccaatagcag caatgttgtc tgtggaaaac gtcttcatta gacctgttga 1680 
tccagagtac cagaaggaag cagaacagag acatcgagaa ttggcagcta aagctggagg 1740 
atttaatgac tttgcaactt tagctgtcat ctttgaacaa tgcaaatcaa gtggagctcc 1800 
agcttcatgg tgccaaaaac actggattca ttggaggtgc ttattttctg catttcgtgt 1860 
ggaagctcaa cttcgagaac taatcaggaa gcttaaacag caaagtgatt tcccaaaaga 1920 
gacttttgaa ggccctaaac atgaagtact acgaagatgt ctttgtgcgg gctacttcaa 1980
aaatgttgct cgaagatctg ttggcagaac attttgcaca atggatgggc gtggaagtcc 2040 
agttcatatc catccttcct cagcacttca tgaacaggaa accaaacttg aatggataat 2100 
ttttcatgaa gtattagtta ccaccaaagt ctatgcaaga attgtgtgcc caatccgtta 2160 
tgaatgggtg agagacttac tacccaagtt gcatgaactt aatgcgcatg atttgagcag 2220 
tgtggctaga cgtgaaatga gagaagatgc aagaagaaaa tggacaaata aggaaaatgt 2280 
gaaacagcta aaggatggaa tatcaaaaga agttctaaag aaaatgcaaa gaagaaatga 2340 
tgacaaatcc atatcagatg cacgggctcg cttccttgag agaaagcaac agagaatcca 2400 
ggaccacagt gacacattaa aggaaactgg ctaagcagtg gactatccaa ttcaggaagc 2460 
aggaaagcag ccaggaaatg tgcttctgtt ttgccagtta tttccgacag tactatcaga 2520 
aggaagtggt cagcagctgt tcctagcctg tgagccgaaa gcaaatcaga atttataaat 2580 
cacatctcat caatactaac aaatgacatt ttgaaaggaa agattgggtt gcatagtcat 2640 
agtacatgaa tccaatcaaa gatgatacaa tatttccctc acttctgttt tgctggcctt 2700 
aactggtatc aaacagtgtc actgagatat tttcaaagaa cactgaattg tatttaatca 2760
gcgtgtattc catttgcatt gaagcattaa agattatttt cc 2802

<210> 68 
<211> 2157 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 4447743CB1 

<400> 68 

gggcttcagc tctctgcgtt ctcggctccg ggaggcctcg gtgattcagc cacagcctct 60 
gcctcccgtt gctctgtgac ctgagggtat tggacaattt gtagctaaga ctcccggata 120 
ccctgaagtc gggaaatgga actcgtaaca ttcagggatg tggccataga attctcccct 180 
gaagagtgga aatgtctgga ccctgcccag cagaatttgt atagagatgt gatgttggag 240 
aactacagga acctggtctc cctgggtttt gtgatctcta acccagacct ggtcacctgt 300 
ctggagcaaa taaaagagcc ctgcaatttg aagatacatg agacagcagc caaaccccca 360 
gctatatgtt ctcctttcag ccaagacctt tcaccagtgc aggggataga agattcattc 420 
cacaaactta tactgaaaag atacgagaaa tgtggacatg agaatttaca attaagaaaa 480 
ggctgtaaac gtgtgaatga gtgtaaggtg cagaaaggag ttaataatgg agtttaccag 540 
tgcttgtcaa ctacccagag caaaatattt caatgtaata catgtgttaa agtttttagt 600 
aaattttcaa attcaaacaa acataagata agacatactg gagagaaacc ctttaaatgt 660
acagaatgtg gcagatcgtt ttacatgtca cacctaactc aacatacagg aattcatgct 720 
ggagagaaac cctacaaatg tgaaaaatgt ggcaaagcct ttaataggtc cacatcactt 780 
agtaaacata agagaattca tactggagag aaaccctaca catgtgaaga atgtggcaaa 840 
gcctttagac ggtccacagt tctgaacgaa cataagaaaa ttcatactgg agagaaaccc 900 
tacaaatgtg aagaatgtgg caaagccttt acaaggtcca caacactgaa tgaacacaag 960 
aaaattcata ctggagagaa accctacaaa tgtaaagaat gtggcaaagc ctttagatgg 1020 
tccacaagcc tgaatgaaca taagaatatt catactggag agaaacccta caaatgtaaa 1080 
gaatgtggca aagcctttag acagtccagg agcctgaatg aacataaaaa tattcatact 1140 
ggcgaaaaac cctacacatg tgaaaaatgt ggcaaagctt ttaaccaatc ctcaagtctt 1200 
attatacaca ggagcattca ttctgaacaa aaactttaca aatgtgaaga atgtggcaaa 1260 
gcctttactt ggtcctcatc ccttaataaa cataagagaa ttcatactgg agagaaaccc 1320 
tacacatgtg aagaatgtgg caaagctttt tataggtcct cacaccttgc taaacataag 1380
agaattcata ctggagagaa accctacacg tgcgaagaat gtggcaaagc ttttaaccaa 1440 
tcctcaactc ttatattaca caagagaatc cattctgggc aaaaacctta caaatgtgaa 1500 
gaatgtggca aagcctttac acggtccaca acactgaacg aacataagaa aattcatact 1560 
ggcgagaaac cctacaaatg tgaagaatgt ggcaaagctt tcatatggtc cgcaagcctg 1620 
aatgaacata agaatattca tactggagag aaaccctaca aatgtaaaga atgtggcaaa 1680 
gcttttaacc aatcctcagg ccttattata cacaggagca ttcattctga acaaaaactt 1740 
tacaaatgtg aagaatgtgg caaagccttt actcggtcca cagccctgaa tgaacataag 1800 
aaaattcatt ctggagagaa accctacaaa tgcaaagaat gtggcaaagc ctataactta 1860 
tcctcaaccc ttactaaaca taagagaatt catactggag agaaaccctt cacatgtgaa 1920 
gaatgtggca aagccttcaa ttggtcctca tcccttacta aacataagat aattcatact 1980 
ggagagaaat cctacaaatg tgaagaatgt ggcaaagctt ttaatcggcc ctcaaccctt 2040 
actgtacaca agcgaattca tactggcaag gaacatagtt gaatgacatt tctagtaatc 2100 
tctaattcca gtgtctttac acagcaaata aattggagaa tattgctccc atataaa 2157 

<210> 69 
<211> 2104 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature
<223> Incyte ID No: 7497554CB1
<400> 69 

gcgcctgccc cggtgggtgg gaagcaggac tcgggcgctc atgcagcgag cgggcggcgc 60
tcggggcgct agttcctagc agctgggcca gcgtaggggc gcaggtcgga tccggcagag 120
gagggcggag gaggacgcag ggggagggtg gagagacccg cgagccgcag tctcagcctc 180
gtccgacgcg cctccgcctc tcccgggccg ggcccggtgg gcgctcagag cttgagggcg 240
ccggctgctc cctcggtagc gggggcaagc ggaggcaggg gtgtgggcgg ctaaaatgag 300
tgaaaggaga agatctgcag tcgccctgag ctcgcgagca catgccttct ccgttgaagc 360
cttgatcggc tcaaataaaa aacggaaact gcgagactgg gaggagaagg ggctggacct 420
gtctatggag gcgctgagcc ccgcgggccc actcggagac acggaggacg cggcggcaca 480
cggcctggag cctcacccgg attctgagca gagcactggt tcagattctg aggtcctcac 540
tgagcggact tcctgctcct tcagtactca cactgacctg gcctctggtg ctgcaggccc 600
tgtgcctgct gccatgtctt ccatggagga gattcaggtg gagctgcaat gtgctgacct 660
ctggaagcgg ttccatgata ttggaactga aatgatcatc accaaagcag gcaggaggat 720
gtttcctgcc atgagagtga aaatcactgg cctagatcca aatcagcagt actacatagc 780
aatggacatt gtgcctgtgg acaataaaag atacagatat gtgtatcata gctccaagtg 840
gatggtggct ggcaatgctg attcccctgt gcccccaaga gtttatatac accctgattc 900
tctagcttct ggagacacct ggatgagaca ggtggtcagt tttgacaaac tcaagcttac 960
caacaatgag ttggatgatc aaggacatat cattctgcac tctatgcaca aataccagcc 1020
tcgagttcat gtgattcgca aagacttcag cagtgacctt tcacccacta agcctgttcc 1080
tgttggggat ggggtgaaaa cgttcaactt tcctgagact gtgttcacca cagttacggc 1140
ctatcagaat cagcagatta ccagattaaa aattgaccga aacccttttg ctaaaggatt 1200
cagagattct gggagaaaca gaactggact tgaagccatc atggagacat atgcattctg 1260
gagacctcct gtgcgcacac tcaccttcga agacttcacc accatgcaga agcagcaagg 1320
aggcagcaca ggcacttccc caaccacctc cagcactggg acaccatccc cttcggcttc 1380
ttctcatctt ttatctccat cctgttctcc tccaactttt catctggccc ccaacacttt 1440
caatgtgggc tgccgagaaa gccagctgtg taatctaaac ctctctgatt atccaccatg 1500
tgcccgaagc aacatggctg ccttgcagag ctacccaggg ctgagtgaca gtggctacaa 1560
caggcttcag agtggcacca cttcagccac tcagccctct gaaaccttca tgcctcagag 1620
gactccatcc ctgatctcag gaataccaac tcctccctcg ttgcctggca acagcaagat 1680
ggaagcctac ggtggccagc tggggtcctt tcccacttcc cagtttcagt atgtcatgca 1740
ggcaggcaat gctgcctcca gctcctcatc accacacatg ttcgggggca gccacatgca 1800
gcagagctcc tacaatgcct tctcccttca caacccttac aacctgtatg gatacaattt 1860
ccccacttcc cctaggctag ctgcaagccc ggaaaaactg agcgcctctc aaagcacttt 1920
actctgttct tctccttcca acggggcctt tggagagagg cagtacctgc cgtcagggat 1980
ggagcacagc atgcacatga ttagcccttc acccaataac caacaggcaa ccaacacttg 2040
tgatggccgg cagtatgggg cagttccagg ctcctcctcc cagatgtccg tgcacatggt 2100
ttaa 2104

<210> 70 
<211> 1451 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature
<223> Incyte ID No: 7475843CB1 

<400> 70 

tttcagtctc ttctataagt aagaaccagt tctctctgtc tctcttgtgc
attaagaact ctgcccatga ccacttgata cgtatgtgtt ttttctttgc
tgatattcag ggatgtggcc atagaattct ctccggagga gtggagctat
ctcagcagaa tctgtatagg gacgtgatgt tagaaaacta cagaaacctg
gtattgctgt ctctaagcca gacctgatca cctgtctaga gcaaaggaat
atgtgaagaa acatgagaca gtagccagac acccagctgt ttcttctcat
acctcttgcc agagcatggt ataaaagatt catttcaaaa agtgatactg agaagatatg 420
gaagctatgg cattgagaat ttacaattaa agaaagattg ggaaagtgtg ggtgaatcta 480 
aggtgcagaa agaatgttgt aatggactta accaatcttt atcaactaca cataccaaaa 540 
tctttcaatt taataaatgt gtgaaagtct ttagtaaatc gtcaaatcta aatagacata 600 
agataagaca tactggagag atatcttcca actgtaaaga atgtgacaat tccttttaca 660 
tatcctcagt tctaactcca cttcagagaa tccacactgc agagaaatcc tacaagtgta 720 
aacaatgtgg gaaagccttt aggcactgct catgctttct tgaacatgag acaattcata 780 
atgaagagaa acattacaaa tgtaaagaat gtggaaaagt ctttaaatcc ttcacaagcc 840 
tttctaatca cattataatt catactggaa agaaactcta taaatgtgaa gaatgtggca 900 
aagcttttaa ccacagttca aaccatgcca aacataagaa aattcacact ggacagaaac 960 
cccataaatg tgaagaatgt ggcaaagcct ttaactggtt ctcataccta actctacata 1020 
aaagaattca tactggagag aaaccctaca aatgtgatga atgtggcaaa gcttttaacc 1080 
agtgttcaaa cctcactaaa cataagagaa ttcatactgg agagaaaccc tacaaatgtg 1140 
aagaatgtgg caaagctttt aaccggtgct cacaccttac tgaacataaa agaattcata 1200 
ctggagagaa gccctataaa tgtgaagaat gtggtaaagt ctttatatct tgttcaagcc 1260 
tttcaaacca taagagaatt catacaagag agaaatgcta caaatctgaa gaatgtggca 1320 
aaacctttaa ccactgctca gacctcaatg tacctgagaa aattcatacc tgagaaaaat 1380 
cctacaaatg taaaaaatgt ggcaaagcct ttaatacctg ctcatgtctt actcaggacc 1440 
agagttcata t 1451 

<210> 71 

<211> 1609 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 6319550CB1 

<400> 71 

agggaggctg cgtgtgccgg ggctaggggc tggaagtcct ggctctagtt gcacctcgga 60 
aggaaaaggc aaacagagga gggaaggcgt cttaggactg cctggatcca gagcactttc 120 
cacggcctct acaggcctgt gtcgctatgg gttcccccgc cgccccggag ggagcgctgg 180 
gctacgtccg cgagttcact cgccactcct ccgacgtgct gggcaacctc aacgagctgc 240 
gcctgcgcgg gatcctcact gacgtcacgc tgctggttgg cgggcaaccc ctcagagcac 300 
acaaggcagt tctcatcgcc tgcagtggct tcttctattc aattttccgg ggccgtgcgg 360 
gagtcggggt ggacgtgctc tctctgcccg ggggtcccga agcgagaggc ttcgcccctc 420 
tattggactt catgtacact tcgcgcctgc gcctctctcc agccactgca ccagcagtcc 480 
tagcggccgc cacctatttg cagatggagc acgtggtcca ggcatgccac cgcttcatcc 540 
aggccagcta tgaacctctg ggcatctccc tgcgccccct ggaagcagaa cccccaacac 600 
ccccaacggc ccctccacca ggtagtccca ggcgctccga aggacaccca gacccaccta 660 
ctgaatctcg aagctgcagt caaggccccc ccagtccagc cagccctgac cccaaggcct 720 
gcaactggaa aaagtacaag tacatcgtgc taaactctca ggcctcccaa gcagggagcc 780 
tggtcgggga gagaagttct ggtcaacctt gcccccaagc caggctcccc agtggagacg 840 
aggcctccag cagcagcagc agcagcagca gcagcagcag tgaagaagga cccattcctg 900 
gtccccagag caggctctct ccaactgctg ccactgtgca gttcaaatgt ggggctccag 960 
ccagtacccc ctacctcctc acatcccagg ctcaagacac ctctggatca ccctctgaac 1020 
gggctcgtcc actaccggga agtgaatttt tcagctgcca gaactgtgag gctgtggcag 1080 
ggtgctcatc ggggctggac tccttggttc ctggggacga agacaaaccc tataagtgtc 1140 
agctgtgccg gtcttcgttc cgctacaagg gcaaccttgc cagtcaccgt acagtgcaca 1200 
caggggaaaa gccttaccac tgctcaatct gcggagcccg ttttaaccgg ccagcaaacc 1260 
tgaaaacgca cagccgcatc cattcgggag agaagccgta taagtgtgag acgtgcggct 1320 
cgcgctttgt acaggtggca catctgcggg cgcacgtgct gatccacacc ggggagaagc 1380 
cctacccttg ccctacctgc ggaacccgct tccgccacct gcagaccctc aagagccacg 1440 
ttcgcatcca caccggagag aagccttacc actgcgaccc ctgtggcctg catttccggc 1500 
acaagagtca actgcggctg catctgcgcc agaaacacgg agctgctacc aacaccaaag 1560 

tgcactacca cattctcggg gggccctagc tgagcgcagg cccaggccc 1609 

<210> 72 
<211> 2840 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7510064CB1 

<400> 72 

ggagcggcga ctggcgagcc atggcgctgg ggctgcagcg cgcaaggccg gccctttcct 60 
gtggagtcat ctcaccgccg tgcgcaccca ctcgtaactc gcacccgggt cctggctgca 120 
ccgcatcccc tcctgcaccc cctggatggc ccttcagcca acgggggcct gggcgatggt 180 
cgaccacgga gctgcgcaag gaaaagtccc gggatgcggc ccgcagccgg cgcagccagg 240 
agaccgaggt gctgtaccag ctggctcaca cgctgccctt cgcccgcggc gtcagcgccc 300 
acctggacaa ggcctctatc atgcgcctca ccatcagcta cctgcgcatg caccgcctct 360 
gcgccgcagg ggagtggaac caggtgggag cagggggaga accactggat gcctgctacc 420 
tgaaggccct ggagggcttc gtcatggtgc tcaccgccga gggagacatg gcttacctgt 480 
cggagaatgt cagcaaacac ctgggcctca gtcagctgga gctcattgga cacagcatct 540 
ttgatttcat ccacccctgt gaccaagagg agcttcagga cgccctgacc ccccagcaga 600 
ccctgtccag gaggaaggtg gaggccccca cggagcggtg cttctccttg cgcatgaaga 660 
gtacgctcac cagccgcggg cgcaccctca acctcaaggc ggccacctgg aaggtgctga 720 
actgctctgg acatatgagg gcctacaagc cacctgcgca gacttctcca gctgggagcc 780 
ctgactcaga gcccccgctg cagtgcctgg tgctcatctg cgaagccatc ccccacccag 840 
gcagcctgga gcccccactg ggccgagggg ccttcctcag ccgccacagc ctggacatga 900 
agttcaccta ctgtgacgac aggattgcag aagtggctgg ctatagtccc gatgacctga 960 
tcggctgttc cgcctacgag tacatccacg cgctggactc cgatgcggtc agcaagagca 1020 
tccacacctt gctgagcaag ggccaggcag taacagggca gtatcgcttc ctggcccgga 1080 
gtggtggcta cctgtggacc cagacccagg ccacagtggt gtcaggggga cggggccccc 1140 
agtcggagag tatcgtctgt gtccattttt taatcagcca ggtggaagag accggagtgg 1200 
tgctgtccct ggagcaaacg gagcaacact ctcgcagacc cattcagcgg ggcgccccct 1260 
ctcagaagga cacccctaac cctggggaca gccttgacac ccctggcccc cggatccttg 1320 
ccttcctgca cccgccttcc ctgagcgagg ctgccctggc cgctgacccc cgccgtttct 1380 
gcagccctga cctccgtcgc ctcctgggac ccatcctgga tggggcttca gtagcagcca 1440 
ctcccagcac cccgctggcc acacggcacc cccaaagtcc tctttcggct gatctcccag 1500 
atgaactacc tgtgggcacc gagaatgtgc acagactctt cacctccggg aaagacactg 1560 
aggcagtgga gacagattta gatatagctc agatgaggaa actgaagctc agactgttga 1620 
ccacaggcac agaactcaga agtgatggtg ctgggacttc agccaaggtc cacccaagtc 1680 
caaggctcat cctcttacct ccctcctgcc ctccgcagga tgctgatgct ctggatttgg 1740 
agatgctggc cccctacatc tccatggatg atgacttcca gctcaacgcc agcgagcagc 1800 
tacccagggc ctaccacaga cctctggggg ctgtcccccg gccccgtgct cggagcttcc 1860 
atggcctgtc acctccagcc cttgagccct ccctgctacc ccgctggggg agtgaccccc 1920 
ggctgagctg ctccagccct tccagagggg acccctcagc atcctctccc atggctgggg 1980 
ctcggaagag gaccctggcc cagagctcag aggacgagga cgagggagtg gagctgctgg 2040 
gagtgagacc tcccaaaagg tcccccagcc cagaacacga aaactttctg ctctttcctc 2100 
tcagcctgag tttccttctg acaggaggac cagccccagg gagcctgcag gaccccactg 2160 
aacttaccca attccttctt tcagtcttaa gttttcccat tctagacccc taccctctag 2220 
gctgtgctgc tcctggactt catgcctctc cattctcatt gcctacaatc tctgtgcccc 2280 
agaaccccct ccacttccca ccccagccct ccagacatgc acttaccttg actttacccc 2340 
acatgtttgg ggcacctggg gctccctcac cccttgggtg gtttgcaatc tgaagacttc 2400 
tccagccaca caggcacatg cacaggcacg gtgctgtctg catattgcca ggtggggaga 2460 
gaagccagga cccctcagct gtctgccacc atctatgtgc ctcccttacc ccccagcttt 2520 
ctttctacag atggtgctac tcttggtctc ccacaggaaa aggcctcccc ccttcttagc 2580 
cccatttacc ccgtttgtgg aaggcactgc tcgctctgtt ttgtcagaga gtggcctatc 2640 
cagattggtg ctatgggggg gtctgacccc tccctcctcc ctctggaggt gatgtgggcc 2700 
ctcaatggag ggaattgtgc tgggctaggg aaaggggagg gactagactg gccacactgg 2760 
ctctgaaact caccaatctc tatacaccat aaagacctca ccttggtagg caccagaaaa 2820 
aaaaaaaaaa aaaaaaaatt 2840 
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(54) Title: NUCLEIC ACID-ASSOCIATED PROTEINS 

(57) Abstract: Various embodiments of the invention provide human nucleic acid-associated proteins (NAAP) and polynucleotides 
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and antagonists. Other embodiments provide methods for diagnosing, treating, or preventing disorders associated with aberrant 
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of SEQ ID NOS : 2-36, respectively. 

Group 181, claim(s) 26, drawn to a method of screening a compound that specifically binds to the polypeptide of SEQ ID NO: 1. 

Groups 182-216, claim(s) 26, drawn to a method of screening a compound that specifically binds to each of the polypeptides of SEQ 
ID NOS: 2-36, respectively. 

Group 217, claim(s) 28-29, drawn to a method of screening a compound for assessing toxicity or effectiveness in altering the target 
nucleotide sequence of SEQ ID NO: 37. 

Groups 218-252, claim(s) 28-29, drawn to a method of screening a compound for assessing toxicity or effectiveness in altering the 
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ID NO : 1 , host cell and a method of making the polypeptide, which Groups 2-288 do not share. For the same reason. Group 2-36 has 
the special technical feature of the nucleotide sequences of SEQ ID NOS: 37-72 and the encoded polypeptides of SEQ ID NOS: 2-36, 
respectively, host cell and a method of making the polypeptide, which Groups 37-288 do not share. Each of Groups 37-72 
has a special technical feature in a distinct antibody pertaining to each of the polypeptide sequences of SEQ ID NOS: 1-36, which Groups 
73-288 do not share. Each of Groups 73-288 employs structurally distinct nucleotide sequences of SEQ ID NOS: 37-72 or the 
polypeptide sequences of SEQ ID NOS: 1-36 in distinct methods; however, in view of 37 CFR 1.475(b), when claims corresponding 
to different categories of inventions are present then only (3) and additional methods of use are deemed to lack unity. Thus the 
various groups discussed above show a lack of unity of invention. 
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